Window functions

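Window functions are commonly used for complex calculations such as rankings within groups, moving averages, and running totals. This topic describes how to use window functions in AnalyticDB for MySQL and provides examples.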

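  • Aggregate functions

  • Ranking functions

    • CUME_DIST: returns the cumulative distribution of each value in a set of values.

    • RANK: returns the rank of each value in a dataset.

    • DENSE_RANK: returns the rank of each value in a set of values, without gaps after ties.

    • NTILE: distributes the rows of each window partition into n buckets numbered from 1 to n.

    • ROW_NUMBER: returns a unique sequential number for each row, starting from 1, based on the order of rows within the window partition.

    • PERCENT_RANK: returns the percentage rank of each value in a dataset, computed as (r - 1) / (n - 1), where r is the rank of the current row as computed by RANK() and n is the total number of rows in the current window partition.

  • Value functions

    • FIRST_VALUE: returns the value of the first row of the window partition.

    • LAST_VALUE: returns the value of the last row of the window partition.

    • LAG: returns the value of the row that is offset rows before the current row in the window.

    • LEAD: returns the value of the row that is offset rows after the current row in the window.

    • NTH_VALUE: returns the value at the specified offset in the window. Offsets start from 1.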

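Overview

Window functions perform calculations across the rows of a query result. They run after the HAVING clause and before the ORDER BY clause. A window function is invoked with the OVER clause, which specifies the window.

AnalyticDB for MySQL supports three types of window functions: aggregate functions, ranking functions, and value functions.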

Syntax

function over ([partition by a] order by b RANGE|ROWS BETWEEN start AND end)                

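A window function clause consists of the following three parts.

  • Partition specification (optional): separates the input rows into different partitions, in much the same way that the GROUP BY clause separates rows into groups.

  • Ordering specification: determines the order in which the window function processes the input rows.

  • Window frame: specifies the boundaries of the rows used in the calculation.

    The window frame supports two modes, RANGE and ROWS:

    • RANGE defines the frame by a range of values of the ordering column.

    • ROWS defines the frame by a number of rows.

    • In both modes, use BETWEEN start AND end to specify the frame boundaries. Valid values for start and end:

      • CURRENT ROW: the current row.

      • N PRECEDING: N rows before the current row.

      • UNBOUNDED PRECEDING: the first row of the partition.

      • N FOLLOWING: N rows after the current row.

      • UNBOUNDED FOLLOWING: the last row of the partition.

For example, the following query computes a running sum of profit for each row in its window.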

select year,country,profit,sum(profit) over (partition by country order by year ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW) as slidewindow from testwindow;
+------+---------+--------+-------------+
| year | country | profit | slidewindow |
+------+---------+--------+-------------+
| 2001 | USA     |     50 |          50 |
| 2001 | USA     |   1500 |        1550 |
| 2000 | Germany |     75 |          75 |
| 2000 | Germany |     75 |         150 |
| 2001 | Germany |     79 |         229 |
| 2000 | Finland |   1500 |        1500 |
| 2001 | Finland |     10 |        1510 |        

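By contrast, the following query, which specifies neither an ordering nor a frame, can compute only the total profit of each partition.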

select country,sum(profit) over (partition by country) from testwindow;
+---------+-----------------------------------------+
| country | sum(profit) OVER (PARTITION BY country) |
+---------+-----------------------------------------+
| Germany |                                     229 |
| Germany |                                     229 |
| Germany |                                     229 |
| USA     |                                    1550 |
| USA     |                                    1550 |
| Finland |                                    1510 |
| Finland |                                    1510 |        
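The difference between the two queries above can be sketched in plain Python. This is a minimal illustration only, not how AnalyticDB for MySQL evaluates windows. The helper name running_and_total is hypothetical, and rows that tie on year are assumed to keep their input order.

```python
from itertools import groupby

# Rows of the testwindow table, reduced to (year, country, profit).
rows = [
    (2000, "Finland", 1500), (2001, "Finland", 10),
    (2000, "Germany", 75), (2000, "Germany", 75), (2001, "Germany", 79),
    (2001, "USA", 50), (2001, "USA", 1500),
]

def running_and_total(rows):
    """Return (year, country, profit, running_sum, partition_total) per row."""
    out = []
    # PARTITION BY country ORDER BY year; sorted() is stable, so ties keep input order.
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    for _, part in groupby(ordered, key=lambda r: r[1]):
        part = list(part)
        total = sum(r[2] for r in part)  # no frame: the whole partition
        acc = 0
        for year, country, profit in part:
            acc += profit  # ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
            out.append((year, country, profit, acc, total))
    return out

for row in running_and_total(rows):
    print(row)
```

The fourth field grows row by row inside each country, while the fifth field repeats the same partition total on every row.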

Usage notes

Frame boundaries must meet the following requirements:

  • start cannot be UNBOUNDED FOLLOWING. Otherwise, the error Window frame start cannot be UNBOUNDED FOLLOWING is returned.

  • end cannot be UNBOUNDED PRECEDING. Otherwise, the error Window frame end cannot be UNBOUNDED PRECEDING is returned.

  • If start is CURRENT ROW and end is N PRECEDING, the error Window frame starting from CURRENT ROW cannot end with PRECEDING is returned.

  • If start is N FOLLOWING and end is N PRECEDING, the error Window frame starting from FOLLOWING cannot end with PRECEDING is returned.

  • If start is N FOLLOWING and end is CURRENT ROW, the error Window frame starting from FOLLOWING cannot end with CURRENT ROW is returned.

When the mode is RANGE:

  • If start or end is N PRECEDING, the error Window frame RANGE PRECEDING is only supported with UNBOUNDED is returned.

  • If start or end is N FOLLOWING, the error Window frame RANGE FOLLOWING is only supported with UNBOUNDED is returned.
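The boundary rules above can be collected into one small checker. The following Python sketch is hypothetical: validate_frame is not part of any AnalyticDB API, the boundary kinds are passed as plain strings, and the order in which the checks run is an assumption. Only the error messages are taken from this topic.

```python
def validate_frame(mode, start, end):
    """Raise ValueError with the documented message if the frame is invalid.

    mode is "ROWS" or "RANGE". start and end are one of:
    "UNBOUNDED PRECEDING", "N PRECEDING", "CURRENT ROW",
    "N FOLLOWING", "UNBOUNDED FOLLOWING".
    """
    if start == "UNBOUNDED FOLLOWING":
        raise ValueError("Window frame start cannot be UNBOUNDED FOLLOWING")
    if end == "UNBOUNDED PRECEDING":
        raise ValueError("Window frame end cannot be UNBOUNDED PRECEDING")
    if start == "CURRENT ROW" and end == "N PRECEDING":
        raise ValueError("Window frame starting from CURRENT ROW cannot end with PRECEDING")
    if start == "N FOLLOWING" and end == "N PRECEDING":
        raise ValueError("Window frame starting from FOLLOWING cannot end with PRECEDING")
    if start == "N FOLLOWING" and end == "CURRENT ROW":
        raise ValueError("Window frame starting from FOLLOWING cannot end with CURRENT ROW")
    if mode == "RANGE":
        # RANGE accepts only UNBOUNDED offsets and CURRENT ROW.
        if "N PRECEDING" in (start, end):
            raise ValueError("Window frame RANGE PRECEDING is only supported with UNBOUNDED")
        if "N FOLLOWING" in (start, end):
            raise ValueError("Window frame RANGE FOLLOWING is only supported with UNBOUNDED")

# A valid frame passes silently.
validate_frame("ROWS", "UNBOUNDED PRECEDING", "CURRENT ROW")
```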

Preparations

The window function examples in this topic use the testwindow table as test data.

create table testwindow(year int, country varchar(20), product varchar(20), profit int) distributed by hash(year);        
insert into testwindow values (2000,'Finland','Computer',1500);
insert into testwindow values (2001,'Finland','Phone',10);
insert into testwindow values (2000,'Germany','Calculator',75);
insert into testwindow values (2000,'Germany','Calculator',75);
insert into testwindow values (2001,'Germany','Calculator',79);
insert into testwindow values (2001,'USA','Calculator',50);
insert into testwindow values (2001,'USA','Computer',1500);        
SELECT * FROM testwindow;
+------+---------+------------+--------+
| year | country | product    | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer   |   1500 |
| 2001 | Finland | Phone      |     10 |
| 2000 | Germany | Calculator |     75 |
| 2000 | Germany | Calculator |     75 |
| 2001 | Germany | Calculator |     79 |
| 2001 | USA     | Calculator |     50 |
| 2001 | USA     | Computer   |   1500 |        

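Aggregate functions

Any aggregate function can be used as a window function by adding an OVER clause. The aggregate is then computed for each row over the rows in that row's current sliding window.

For example, the following query returns a rolling sum of order totals, by order date, for each clerk.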

SELECT clerk, orderdate, orderkey, totalprice,sum(totalprice) OVER (PARTITION BY clerk ORDER BY orderdate) AS rolling_sum FROM orders ORDER BY clerk, orderdate, orderkey            

CUME_DIST

CUME_DIST()           
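  • Description: returns the cumulative distribution of each value in a set of values.

    Result: the number of rows that precede or tie with the current row in the ordered window partition, divided by the total number of rows in the partition. Rows with tied values in the ordering get the same distribution value.

  • Return type: DOUBLE.

  • Example: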

    select year,country,product,profit,cume_dist() over (partition by country order by profit) as cume_dist from testwindow;
    +------+---------+------------+--------+--------------------+
    | year | country | product    | profit | cume_dist          |
    +------+---------+------------+--------+--------------------+
    | 2001 | USA     | Calculator |     50 |                0.5 |
    | 2001 | USA     | Computer   |   1500 |                1.0 |
    | 2001 | Finland | Phone      |     10 |                0.5 |
    | 2000 | Finland | Computer   |   1500 |                1.0 |
    | 2000 | Germany | Calculator |     75 | 0.6666666666666666 |
    | 2000 | Germany | Calculator |     75 | 0.6666666666666666 |
    | 2001 | Germany | Calculator |     79 |                1.0 |                
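To make the arithmetic behind the Germany rows concrete, here is a minimal Python sketch of the cumulative distribution for a single partition. The function name cume_dist is only an illustrative helper, and the input is assumed to be the profit values of one partition.

```python
def cume_dist(values):
    """Share of rows whose value is less than or equal to each row's value."""
    n = len(values)
    return [sum(1 for v in values if v <= x) / n for x in values]

# Germany partition: two rows tie at 75, so both get 2/3.
print(cume_dist([75, 75, 79]))  # [0.6666666666666666, 0.6666666666666666, 1.0]
```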

RANK

RANK()            
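  • Description: returns the rank of each value in a dataset.

    The rank is 1 plus the number of rows that precede the current row, not counting the current row or its ties. Tied values in the ordering can therefore produce gaps in the sequence. The rank is computed separately for each window partition.

  • Return type: BIGINT.

  • Example: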

    select year,country,product,profit,rank() over (partition by country order by profit) as rank from testwindow;
    +------+---------+------------+--------+------+
    | year | country | product    | profit | rank |
    +------+---------+------------+--------+------+
    | 2001 | Finland | Phone      |     10 |    1 |
    | 2000 | Finland | Computer   |   1500 |    2 |
    | 2001 | USA     | Calculator |     50 |    1 |
    | 2001 | USA     | Computer   |   1500 |    2 |
    | 2000 | Germany | Calculator |     75 |    1 |
    | 2000 | Germany | Calculator |     75 |    1 |
    | 2001 | Germany | Calculator |     79 |    3 |                    
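The gap after a tie follows directly from the definition. The following Python sketch, with a hypothetical helper named rank, computes the rank for the values of one partition as 1 plus the number of strictly smaller values.

```python
def rank(values):
    """RANK for one partition ordered ascending: 1 + count of smaller values."""
    return [1 + sum(1 for v in values if v < x) for x in values]

# Germany partition: the two 75s share rank 1, so 79 gets rank 3, not 2.
print(rank([75, 75, 79]))  # [1, 1, 3]
```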

DENSE_RANK

DENSE_RANK()            
  • Description: returns the rank of each value in a set of values.

    DENSE_RANK() is similar to RANK(), except that tied values do not produce gaps in the sequence.

  • Return type: BIGINT.

  • Example:

    select year,country,product,profit,dense_rank() over (partition by country order by profit) as dense_rank from testwindow;
    +------+---------+------------+--------+------------+
    | year | country | product    | profit | dense_rank |
    +------+---------+------------+--------+------------+
    | 2001 | Finland | Phone      |     10 |          1 |
    | 2000 | Finland | Computer   |   1500 |          2 |
    | 2001 | USA     | Calculator |     50 |          1 |
    | 2001 | USA     | Computer   |   1500 |          2 |
    | 2000 | Germany | Calculator |     75 |          1 |
    | 2000 | Germany | Calculator |     75 |          1 |
    | 2001 | Germany | Calculator |     79 |          2 |                   
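One way to picture the dense rank is as the position of each value among the distinct values of the partition. The sketch below assumes ascending order; dense_rank is an illustrative helper name.

```python
def dense_rank(values):
    """DENSE_RANK for one partition: position among the distinct sorted values."""
    position = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [position[v] for v in values]

print(dense_rank([75, 75, 79]))      # [1, 1, 2]
print(dense_rank([10, 20, 20, 30]))  # [1, 2, 2, 3]
```

For the second list, RANK() would return 1 2 2 4, skipping rank 3 after the tie.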

NTILE

NTILE(n)            
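  • Description: distributes the rows of each window partition into n buckets numbered from 1 to n.

    Bucket values differ by at most 1. If the rows of the partition cannot be divided evenly among the buckets, the remaining rows are assigned one per bucket, starting from the first bucket. For example, with 6 rows and 4 buckets, the bucket values are 1 1 2 2 3 4.

  • Return type: BIGINT.

  • Example: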

    select year,country,product,profit,ntile(2) over (partition by country order by profit) as ntile2 from testwindow;
    +------+---------+------------+--------+--------+
    | year | country | product    | profit | ntile2 |
    +------+---------+------------+--------+--------+
    | 2001 | USA     | Calculator |     50 |      1 |
    | 2001 | USA     | Computer   |   1500 |      2 |
    | 2001 | Finland | Phone      |     10 |      1 |
    | 2000 | Finland | Computer   |   1500 |      2 |
    | 2000 | Germany | Calculator |     75 |      1 |
    | 2000 | Germany | Calculator |     75 |      1 |
    | 2001 | Germany | Calculator |     79 |      2 |                    
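The distribution rule described above can be sketched in a few lines of Python. The helper ntile here is hypothetical and takes the number of rows in an already ordered partition rather than the rows themselves.

```python
def ntile(row_count, n):
    """Bucket number for each of row_count ordered rows, split into n buckets."""
    base, extra = divmod(row_count, n)
    buckets = []
    for b in range(1, n + 1):
        # The first `extra` buckets each take one of the leftover rows.
        size = base + (1 if b <= extra else 0)
        buckets.extend([b] * size)
    return buckets

print(ntile(6, 4))  # [1, 1, 2, 2, 3, 4]
print(ntile(3, 2))  # [1, 1, 2], as in the Germany partition
```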

ROW_NUMBER

ROW_NUMBER()            
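  • Description: returns a unique sequential number for each row, starting from 1, based on the order of rows within the window partition.

  • Return type: BIGINT.

  • Example: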

    SELECT year, country, product, profit, ROW_NUMBER() OVER(PARTITION BY country) AS row_num1 FROM testwindow;
    +------+---------+------------+--------+----------+
    | year | country | product    | profit | row_num1 |
    +------+---------+------------+--------+----------+
    | 2001 | USA     | Calculator |     50 |        1 |
    | 2001 | USA     | Computer   |   1500 |        2 |
    | 2000 | Germany | Calculator |     75 |        1 |
    | 2000 | Germany | Calculator |     75 |        2 |
    | 2001 | Germany | Calculator |     79 |        3 |
    | 2000 | Finland | Computer   |   1500 |        1 |
    | 2001 | Finland | Phone      |     10 |        2 |                    

PERCENT_RANK

PERCENT_RANK()            
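  • Description: returns the percentage rank of each value in a dataset, computed as (r - 1) / (n - 1), where r is the rank of the current row as computed by RANK() and n is the total number of rows in the current window partition.

  • Return type: DOUBLE.

  • Example: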

    select year,country,product,profit,PERCENT_RANK() over (partition by country order by profit) as ntile3 from testwindow;
    +------+---------+------------+--------+--------+
    | year | country | product    | profit | ntile3 |
    +------+---------+------------+--------+--------+
    | 2001 | Finland | Phone      |     10 |    0.0 |
    | 2000 | Finland | Computer   |   1500 |    1.0 |
    | 2001 | USA     | Calculator |     50 |    0.0 |
    | 2001 | USA     | Computer   |   1500 |    1.0 |
    | 2000 | Germany | Calculator |     75 |    0.0 |
    | 2000 | Germany | Calculator |     75 |    0.0 |
    | 2001 | Germany | Calculator |     79 |    1.0 |                    
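The formula (r - 1) / (n - 1) can be checked by hand with a short Python sketch. The helper percent_rank is illustrative. Returning 0.0 for a single-row partition is an assumption made here to avoid dividing by zero; this topic does not describe that case.

```python
def percent_rank(values):
    """(r - 1) / (n - 1) for one partition ordered ascending."""
    n = len(values)
    if n == 1:
        return [0.0]  # assumption: single-row partition, not covered by the formula
    ranks = [1 + sum(1 for v in values if v < x) for x in values]  # RANK()
    return [(r - 1) / (n - 1) for r in ranks]

# Germany partition: ranks are 1, 1, 3 and n is 3.
print(percent_rank([75, 75, 79]))  # [0.0, 0.0, 1.0]
```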

FIRST_VALUE

FIRST_VALUE(x)        
  • Description: returns the value of the first row of the window partition.

  • Return type: same as the type of the input argument.

  • Example:

    select year,country,product,profit,first_value(profit) over (partition by country order by profit) as firstValue from testwindow;
    +------+---------+------------+--------+------------+
    | year | country | product    | profit | firstValue |
    +------+---------+------------+--------+------------+
    | 2000 | Germany | Calculator |     75 |         75 |
    | 2000 | Germany | Calculator |     75 |         75 |
    | 2001 | Germany | Calculator |     79 |         75 |
    | 2001 | USA     | Calculator |     50 |         50 |
    | 2001 | USA     | Computer   |   1500 |         50 |
    | 2001 | Finland | Phone      |     10 |         10 |
    | 2000 | Finland | Computer   |   1500 |         10 |                

LAST_VALUE

LAST_VALUE(x)            
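  • Description: returns the value of the last row of the window partition. The default frame of LAST_VALUE is rows between unbounded preceding and current row, so only the current row and the rows before it are considered, and the last row of that frame is the current row itself. To show the last value of the partition on every row, the way FIRST_VALUE shows the first value, append rows between unbounded preceding and unbounded following after the order by clause.

  • Return type: same as the type of the input argument.

  • Example 1: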

    select year,country,product,profit,last_value(profit) over (partition by country order by profit) as lastValue from testwindow;
    +----------------+-------------------+-------------------+------------------+----------------------+
    | year           | country           | product           | profit           | lastValue            |
    +----------------+-------------------+-------------------+------------------+----------------------+
    |           2001 | USA               | Calculator        |               50 |                   50 |
    |           2001 | USA               | Computer          |             1500 |                 1500 |
    |           2001 | Finland           | Phone             |               10 |                   10 |
    |           2000 | Finland           | Computer          |             1500 |                 1500 |
    |           2000 | Germany           | Calculator        |               75 |                   75 |
    |           2000 | Germany           | Calculator        |               75 |                   75 |
    |           2001 | Germany           | Calculator        |               79 |                   79 |                 
  • Example 2:

    select year,country,product,profit,last_value(profit) over (partition by country order by profit rows between unbounded preceding and unbounded following) as lastValue from testwindow;
    +------+---------+-----------+--------+-----------+
    | year | country | product   | profit | lastValue |
    +------+---------+-----------+--------+-----------+
    | 2001 | Finland | Phone     |   10   |   1500    |
    | 2000 | Finland | Computer  |  1500  |   1500    |
    | 2000 | Germany | Calculator|   75   |    79     |
    | 2000 | Germany | Calculator|   75   |    79     |
    | 2001 | Germany | Calculator|   79   |    79     |
    | 2001 | USA     | Calculator|   50   |   1500    |
    | 2001 | USA     | Computer  |  1500  |   1500    |
    +------+---------+-----------+--------+-----------+
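The two LAST_VALUE examples differ only in the frame. A small Python sketch shows why the default frame returns each row's own value. The helper last_value and its whole_partition flag are hypothetical names; the input is assumed to be one partition already sorted by the ordering column.

```python
def last_value(values, whole_partition=False):
    """LAST_VALUE over one ordered partition.

    whole_partition=False: frame is unbounded preceding .. current row.
    whole_partition=True:  frame is unbounded preceding .. unbounded following.
    """
    result = []
    for i in range(len(values)):
        frame = values if whole_partition else values[: i + 1]
        result.append(frame[-1])  # last row of the frame
    return result

print(last_value([75, 75, 79]))                        # [75, 75, 79]
print(last_value([75, 75, 79], whole_partition=True))  # [79, 79, 79]
```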

LAG

LAG(x[, offset[, default_value]])           
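  • Description: returns the value of the row that is offset rows before the current row in the window.

    Offsets start from 0, which refers to the current row. The offset can be any scalar expression. The default offset is 1.

    If the offset is null or larger than the window, default_value is returned. If default_value is not specified, null is returned.

  • Return type: same as the type of the input argument.

  • Example: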

    select year,country,product,profit,lag(profit) over (partition by country order by profit) as lag from testwindow;
    +------+---------+------------+--------+------+
    | year | country | product    | profit | lag  |
    +------+---------+------------+--------+------+
    | 2001 | USA     | Calculator |     50 | NULL |
    | 2001 | USA     | Computer   |   1500 |   50 |
    | 2000 | Germany | Calculator |     75 | NULL |
    | 2000 | Germany | Calculator |     75 |   75 |
    | 2001 | Germany | Calculator |     79 |   75 |
    | 2001 | Finland | Phone      |     10 | NULL |
    | 2000 | Finland | Computer   |   1500 |   10 |                    

LEAD

LEAD(x[,offset[, default_value]])            
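  • Description: returns the value of the row that is offset rows after the current row in the window.

    Offsets start from 0, which refers to the current row. The offset can be any scalar expression. The default offset is 1.

    If the offset is null or larger than the window, default_value is returned. If default_value is not specified, null is returned.

  • Return type: same as the type of the input argument.

  • Example: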

    select year,country,product,profit,lead(profit) over (partition by country order by profit) as lead from testwindow;
    +------+---------+------------+--------+------+
    | year | country | product    | profit | lead |
    +------+---------+------------+--------+------+
    | 2000 | Germany | Calculator |     75 |   75 |
    | 2000 | Germany | Calculator |     75 |   79 |
    | 2001 | Germany | Calculator |     79 | NULL |
    | 2001 | Finland | Phone      |     10 | 1500 |
    | 2000 | Finland | Computer   |   1500 | NULL |
    | 2001 | USA     | Calculator |     50 | 1500 |
    | 2001 | USA     | Computer   |   1500 | NULL |                    
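LAG and LEAD are mirror images of each other, which a short Python sketch makes visible. The helpers lag and lead below are illustrative only. They operate on one already ordered partition, use None for SQL null, and assume a non-negative offset.

```python
def lag(values, offset=1, default=None):
    """Value `offset` rows before each row, or `default` when out of range."""
    if offset is None:
        return [default] * len(values)
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]

def lead(values, offset=1, default=None):
    """Value `offset` rows after each row, or `default` when out of range."""
    if offset is None:
        return [default] * len(values)
    n = len(values)
    return [values[i + offset] if i + offset < n else default for i in range(n)]

germany = [75, 75, 79]
print(lag(germany))   # [None, 75, 75]
print(lead(germany))  # [75, 79, None]
```

Passing a third argument, for example lag(germany, 1, 0), replaces the out-of-range None with 0.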

NTH_VALUE

NTH_VALUE(x, offset)            
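  • Description: returns the value at the specified offset in the window. Offsets start from 1.

    If the offset is null or greater than the number of values in the window, null is returned. If the offset is 0 or negative, an error is returned.

  • Return type: same as the type of the input argument.

  • Example: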

    select year,country,product,profit,nth_value(profit,1) over (partition by country order by profit) as nth_value from testwindow;
    +------+---------+------------+--------+-----------+
    | year | country | product    | profit | nth_value |
    +------+---------+------------+--------+-----------+
    | 2001 | Finland | Phone      |     10 |        10 |
    | 2000 | Finland | Computer   |   1500 |        10 |
    | 2001 | USA     | Calculator |     50 |        50 |
    | 2001 | USA     | Computer   |   1500 |        50 |
    | 2000 | Germany | Calculator |     75 |        75 |
    | 2000 | Germany | Calculator |     75 |        75 |
    | 2001 | Germany | Calculator |     79 |        75 |
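The offset rules for NTH_VALUE can be sketched in Python as follows. The helper nth_value is hypothetical. It assumes the frame runs from the first row of the partition to the current row, which is the default this topic describes for LAST_VALUE, and the error text is a placeholder rather than the message the database returns.

```python
def nth_value(values, offset):
    """NTH_VALUE over one ordered partition; the frame is assumed to be
    unbounded preceding .. current row."""
    if offset is None:
        return [None] * len(values)
    if offset <= 0:
        # Placeholder message; the real error text is not given in this topic.
        raise ValueError("offset must be a positive integer")
    result = []
    for i in range(len(values)):
        frame = values[: i + 1]
        result.append(frame[offset - 1] if offset <= len(frame) else None)
    return result

print(nth_value([75, 75, 79], 1))  # [75, 75, 75], as in the Germany rows
print(nth_value([10, 1500], 2))    # [None, 1500]
```

With an offset of 2, the first row has only one row in its frame, so the sketch returns None for it.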