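For a select statement written in the standard select syntax, the logical execution order of its clauses differs from the order in which they are written. This topic describes the execution order of the operations in a select statement and provides examples.
Execution order
The SELECT syntax mainly involves the following operations: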
select
from
where
group by
having
window
qualify
order by
distribute by
sort by
limit
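Because order by cannot be used together with distribute by or sort by, and group by cannot be used together with distribute by or sort by, a select statement commonly executes in one of the following orders: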
Scenario 1: from -> where -> group by -> having -> select -> order by -> limit
Scenario 2: from -> where -> select -> distribute by -> sort by
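The restriction and the two orders above can be sketched as a small lookup. This is only an illustrative Python sketch, not part of MaxCompute: the function name execution_order and the error messages are hypothetical, and the sketch covers only the clauses listed in the two scenarios.

```python
# Logical execution order of the two common scenarios described above.
SCENARIO_1 = ["from", "where", "group by", "having", "select", "order by", "limit"]
SCENARIO_2 = ["from", "where", "select", "distribute by", "sort by"]


def execution_order(clauses):
    """Arrange the clauses of a query in logical execution order.

    Hypothetical helper for illustration only. It raises ValueError when
    order by or group by is combined with distribute by or sort by, or when
    a clause is not covered by the chosen scenario.
    """
    used = set(clauses)
    hash_side = used & {"distribute by", "sort by"}
    sort_side = used & {"order by", "group by"}
    if hash_side and sort_side:
        raise ValueError(
            f"{sorted(sort_side)} cannot be used together with {sorted(hash_side)}"
        )
    # Scenario 2 applies when distribute by / sort by is present, otherwise Scenario 1.
    template = SCENARIO_2 if hash_side else SCENARIO_1
    uncovered = used - set(template)
    if uncovered:
        raise ValueError(f"clauses not covered by this sketch: {sorted(uncovered)}")
    return [clause for clause in template if clause in used]


if __name__ == "__main__":
    # Clauses given in writing order are returned in execution order.
    print(execution_order(["select", "from", "where", "group by", "order by"]))
```

Passing the clauses in writing order returns them in execution order; mixing order by with distribute by raises an error, mirroring the restriction stated above.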
To avoid confusion, MaxCompute also lets you write a query in its execution order. The syntax then takes the following form:
from <table_reference>
[where <where_condition>]
[group by <col_list>]
[having <having_condition>]
[window <window_name> AS (<window_definition>)]
[qualify <expression>]
select [all | distinct] <select_expr>, <select_expr>, ...
[order by <order_condition>]
[distribute by <distribute_condition> [sort by <sort_condition>] ]
[limit <number>]
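Sample data
To make the examples easier to follow, this topic provides source data on which the examples are based. Create the table sale_detail and insert data with the following commands:
--Create a partitioned table named sale_detail.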
create table if not exists sale_detail
(
shop_name string,
customer_id string,
total_price double
)
partitioned by (sale_date string, region string);
--Add partitions to the source table.
alter table sale_detail add partition (sale_date='2013', region='china') partition (sale_date='2014', region='shanghai');
--Append data to the source table.
insert into sale_detail partition (sale_date='2013', region='china') values ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3);
insert into sale_detail partition (sale_date='2014', region='shanghai') values ('null','c5',null),('s6','c6',100.4),('s7','c7',100.5);
Query the data in the partitioned table sale_detail with the following commands:
set odps.sql.allow.fullscan=true;
select * from sale_detail;
--Returned result.
+------------+-------------+-------------+------------+------------+
| shop_name  | customer_id | total_price | sale_date  | region     |
+------------+-------------+-------------+------------+------------+
| s1         | c1          | 100.1       | 2013       | china      |
| s2         | c2          | 100.2       | 2013       | china      |
| s3         | c3          | 100.3       | 2013       | china      |
| null       | c5          | NULL        | 2014       | shanghai   |
| s6         | c6          | 100.4       | 2014       | shanghai   |
| s7         | c7          | 100.5       | 2014       | shanghai   |
+------------+-------------+-------------+------------+------------+
Examples
Example 1: a query that matches Scenario 1.
Note: To run the following commands against a partitioned table, either add set odps.sql.allow.fullscan=true; before the command to enable full table scans, or specify partitions in the statement.
--Written in standard select syntax.
set odps.sql.allow.fullscan=true;
select region,max(total_price)
from sale_detail
where total_price > 100
group by region
having sum(total_price)>300.5
order by region
limit 5;
--Written in execution order. Equivalent to the previous statement.
from sale_detail
where total_price > 100
group by region
having sum(total_price)>300.5
select region,max(total_price)
order by region
limit 5;
The following result is returned:
+------------+------------+
| region     | _c1        |
+------------+------------+
| china      | 100.3      |
+------------+------------+
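The command is executed as follows:
a. Read from the sale_detail table (from sale_detail) the rows that satisfy where total_price > 100.
b. Group the result of step a by region (group by region).
c. From the result of step b, keep the groups whose total_price sum is greater than 300.5 (having sum(total_price)>300.5).
d. Apply select region,max(total_price) to the result of step c.
e. Sort the result of step d by region (order by region).
f. Return only the first 5 rows of the result of step e (limit 5).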
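The steps of Example 1 can be traced outside the database with a short Python sketch. It is an illustration of the logical order only, not of how MaxCompute actually executes the query; the names SALE_DETAIL and run_example_1 are hypothetical, and the rows are copied from the sample data.

```python
from collections import defaultdict

# Rows copied from the sample table sale_detail.
SALE_DETAIL = [
    {"shop_name": "s1", "customer_id": "c1", "total_price": 100.1, "sale_date": "2013", "region": "china"},
    {"shop_name": "s2", "customer_id": "c2", "total_price": 100.2, "sale_date": "2013", "region": "china"},
    {"shop_name": "s3", "customer_id": "c3", "total_price": 100.3, "sale_date": "2013", "region": "china"},
    {"shop_name": "null", "customer_id": "c5", "total_price": None, "sale_date": "2014", "region": "shanghai"},
    {"shop_name": "s6", "customer_id": "c6", "total_price": 100.4, "sale_date": "2014", "region": "shanghai"},
    {"shop_name": "s7", "customer_id": "c7", "total_price": 100.5, "sale_date": "2014", "region": "shanghai"},
]


def run_example_1(rows):
    """Apply the clauses of Example 1 in their logical execution order."""
    # from + where: a NULL total_price never satisfies the comparison.
    filtered = [r for r in rows if r["total_price"] is not None and r["total_price"] > 100]

    # group by region: collect the prices of each region.
    groups = defaultdict(list)
    for r in filtered:
        groups[r["region"]].append(r["total_price"])

    # having sum(total_price) > 300.5: drop groups that fail the condition.
    kept = {region: prices for region, prices in groups.items() if sum(prices) > 300.5}

    # select region, max(total_price): one output row per remaining group.
    selected = [(region, max(prices)) for region, prices in kept.items()]

    # order by region, then limit 5.
    ordered = sorted(selected, key=lambda row: row[0])
    return ordered[:5]


if __name__ == "__main__":
    print(run_example_1(SALE_DETAIL))  # [('china', 100.3)]
```

Only the china group passes the having condition: its prices sum to about 300.6, while the shanghai group sums to about 200.9 after the NULL row is filtered out.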
Example 2: a query that matches Scenario 2.
--Written in standard select syntax.
set odps.sql.allow.fullscan=true;
select shop_name
,total_price
,region
from sale_detail
where total_price > 100.2
distribute by region
sort by total_price;
--Written in execution order. Equivalent to the previous statement.
from sale_detail
where total_price > 100.2
select shop_name
,total_price
,region
distribute by region
sort by total_price;
The following result is returned:
+------------+-------------+------------+
| shop_name  | total_price | region     |
+------------+-------------+------------+
| s3         | 100.3       | china      |
| s6         | 100.4       | shanghai   |
| s7         | 100.5       | shanghai   |
+------------+-------------+------------+
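The command is executed as follows:
a. Read from the sale_detail table (from sale_detail) the rows that satisfy where total_price > 100.2.
b. Apply select shop_name, total_price, region to the result of step a.
c. Hash-partition the result of step b by region (distribute by region).
d. Sort the result of step c by total_price in ascending order (sort by total_price).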
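The steps of Example 2 can be traced the same way. The following Python sketch is an illustration under stated assumptions: the names SALE_DETAIL and run_example_2 are hypothetical, the number of buckets (num_reducers) is arbitrary, and zlib.crc32 merely stands in for whatever hash function MaxCompute uses internally, which the source does not specify.

```python
import zlib

# Rows copied from the sample table sale_detail: (shop_name, total_price, region).
SALE_DETAIL = [
    ("s1", 100.1, "china"),
    ("s2", 100.2, "china"),
    ("s3", 100.3, "china"),
    ("null", None, "shanghai"),
    ("s6", 100.4, "shanghai"),
    ("s7", 100.5, "shanghai"),
]


def run_example_2(rows, num_reducers=2):
    """Apply the clauses of Example 2 in their logical execution order."""
    # from + where: a NULL total_price never satisfies the comparison.
    filtered = [r for r in rows if r[1] is not None and r[1] > 100.2]

    # select shop_name, total_price, region (the tuples already hold these columns).
    selected = [(shop_name, price, region) for shop_name, price, region in filtered]

    # distribute by region: rows with the same region land in the same bucket.
    # crc32 is an assumed stand-in for the engine's hash function.
    buckets = [[] for _ in range(num_reducers)]
    for row in selected:
        index = zlib.crc32(row[2].encode("utf-8")) % num_reducers
        buckets[index].append(row)

    # sort by total_price: ascending order within each bucket only.
    return [sorted(bucket, key=lambda row: row[1]) for bucket in buckets]


if __name__ == "__main__":
    for number, bucket in enumerate(run_example_2(SALE_DETAIL)):
        print(number, bucket)
```

Unlike order by, sort by orders rows only within each bucket produced by distribute by, so the sketch returns one sorted list per bucket rather than a single globally sorted list.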