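Use EXPLAIN and EXPLAIN ANALYZE to analyze execution plans
This topic describes how to use the EXPLAIN and EXPLAIN ANALYZE commands to analyze query execution plans.
Prerequisites
The AnalyticDB for MySQL cluster must run version 3.1.3 or later.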
EXPLAIN
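You can use the EXPLAIN command to evaluate how a query statement will be executed. The result is an estimate for reference only and is not the same as the actual execution result.
Syntax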
EXPLAIN (format text) <SELECT statement>;
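Note: If the query SQL is not complex, you can add (format text) to the EXPLAIN command to make the hierarchy of the plan tree in the returned result easier to read.
Example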
EXPLAIN (format text) SELECT count(*) FROM nation, region, customer WHERE c_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA';
The following result is returned:
Output[count(*)]
│ Outputs: [count:bigint]
│ Estimates: {rows: 1 (8B)}
│ count(*) := count
└─ Aggregate(FINAL)
   │ Outputs: [count:bigint]
   │ Estimates: {rows: 1 (8B)}
   │ count := count(`count_1`)
   └─ LocalExchange[SINGLE] ()
      │ Outputs: [count_0_1:bigint]
      │ Estimates: {rows: 1 (8B)}
      └─ RemoteExchange[GATHER]
         │ Outputs: [count_0_2:bigint]
         │ Estimates: {rows: 1 (8B)}
         └─ Aggregate(PARTIAL)
            │ Outputs: [count_0_4:bigint]
            │ Estimates: {rows: 1 (8B)}
            │ count_4 := count(*)
            └─ InnerJoin[(`c_nationkey` = `n_nationkey`)][$hashvalue, $hashvalue_0_6]
               │ Outputs: []
               │ Estimates: {rows: 302035 (4.61MB)}
               │ Distribution: REPLICATED
               ├─ Project[]
               │  │ Outputs: [c_nationkey:integer, $hashvalue:bigint]
               │  │ Estimates: {rows: 1500000 (5.72MB)}
               │  │ $hashvalue := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`c_nationkey`), 0))
               │  └─ RuntimeFilter
               │     │ Outputs: [c_nationkey:integer]
               │     │ Estimates: {rows: 1500000 (5.72MB)}
               │     ├─ TableScan[adb:AdbTableHandle{schema=tpch, tableName=customer, partitionColumnHandles=[c_custkey]}]
               │     │  Outputs: [c_nationkey:integer]
               │     │  Estimates: {rows: 1500000 (5.72MB)}
               │     │  c_nationkey := AdbColumnHandle{columnName=c_nationkey, type=4, isIndexed=true}
               │     └─ RuntimeCollect
               │        │ Outputs: [n_nationkey:integer]
               │        │ Estimates: {rows: 5 (60B)}
               │        └─ LocalExchange[ROUND_ROBIN] ()
               │           │ Outputs: [n_nationkey:integer]
               │           │ Estimates: {rows: 5 (60B)}
               │           └─ RuntimeScan
               │              Outputs: [n_nationkey:integer]
               │              Estimates: {rows: 5 (60B)}
               └─ LocalExchange[HASH][$hashvalue_0_6] ("n_nationkey")
                  │ Outputs: [n_nationkey:integer, $hashvalue_0_6:bigint]
                  │ Estimates: {rows: 5 (60B)}
                  └─ Project[]
                     │ Outputs: [n_nationkey:integer, $hashvalue_0_10:bigint]
                     │ Estimates: {rows: 5 (60B)}
                     │ $hashvalue_10 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_nationkey`), 0))
                     └─ RemoteExchange[REPLICATE]
                        │ Outputs: [n_nationkey:integer]
                        │ Estimates: {rows: 5 (60B)}
                        └─ InnerJoin[(`n_regionkey` = `r_regionkey`)][$hashvalue_0_7, $hashvalue_0_8]
                           │ Outputs: [n_nationkey:integer]
                           │ Estimates: {rows: 5 (60B)}
                           │ Distribution: REPLICATED
                           ├─ Project[]
                           │  │ Outputs: [n_nationkey:integer, n_regionkey:integer, $hashvalue_0_7:bigint]
                           │  │ Estimates: {rows: 25 (200B)}
                           │  │ $hashvalue_7 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_regionkey`), 0))
                           │  └─ RuntimeFilter
                           │     │ Outputs: [n_nationkey:integer, n_regionkey:integer]
                           │     │ Estimates: {rows: 25 (200B)}
                           │     ├─ TableScan[adb:AdbTableHandle{schema=tpch, tableName=nation, partitionColumnHandles=[]}]
                           │     │  Outputs: [n_nationkey:integer, n_regionkey:integer]
                           │     │  Estimates: {rows: 25 (200B)}
                           │     │  n_nationkey := AdbColumnHandle{columnName=n_nationkey, type=4, isIndexed=true}
                           │     │  n_regionkey := AdbColumnHandle{columnName=n_regionkey, type=4, isIndexed=true}
                           │     └─ RuntimeCollect
                           │        │ Outputs: [r_regionkey:integer]
                           │        │ Estimates: {rows: 1 (4B)}
                           │        └─ LocalExchange[ROUND_ROBIN] ()
                           │           │ Outputs: [r_regionkey:integer]
                           │           │ Estimates: {rows: 1 (4B)}
                           │           └─ RuntimeScan
                           │              Outputs: [r_regionkey:integer]
                           │              Estimates: {rows: 1 (4B)}
                           └─ LocalExchange[HASH][$hashvalue_0_8] ("r_regionkey")
                              │ Outputs: [r_regionkey:integer, $hashvalue_0_8:bigint]
                              │ Estimates: {rows: 1 (4B)}
                              └─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=region, partitionColumnHandles=[]}]
                                 Outputs: [r_regionkey:integer, $hashvalue_0_9:bigint]
                                 Estimates: {rows: 1 (4B)}/{rows: 1 (B)}
                                 $hashvalue_9 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`r_regionkey`), 0))
                                 r_regionkey := AdbColumnHandle{columnName=r_regionkey, type=4, isIndexed=true}
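The following table describes the main parameters in the returned result.
Parameter | Description
Outputs: [symbol:type] | The output columns of each operator and their data types.
Estimates: {rows: %s (%sB)} | The estimated number of rows and data size for each operator. The optimizer uses these estimates to decide the join order and the data shuffle.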
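Because the optimizer relies on the Estimates values, it can help to list them per operator and see which operator is expected to handle the most rows. The following Python snippet is a minimal sketch of that idea, not part of AnalyticDB for MySQL. It assumes a simplified excerpt of the plan that contains only operator lines and their Estimates lines, and the names PLAN_LINES and estimated_rows are hypothetical.

```python
import re

# Simplified excerpt of the EXPLAIN (format text) output shown above:
# only operator lines and the Estimates line that follows each of them.
PLAN_LINES = [
    "InnerJoin[(`c_nationkey` = `n_nationkey`)][$hashvalue, $hashvalue_0_6]",
    "Estimates: {rows: 302035 (4.61MB)}",
    "TableScan[adb:AdbTableHandle{schema=tpch, tableName=customer, partitionColumnHandles=[c_custkey]}]",
    "Estimates: {rows: 1500000 (5.72MB)}",
    "RuntimeScan",
    "Estimates: {rows: 5 (60B)}",
]

ESTIMATE_RE = re.compile(r"Estimates: \{rows: (\d+) \(([^)]*)\)\}")


def estimated_rows(lines):
    """Pair each operator name with the estimate that follows it."""
    result = []
    operator = None
    for line in lines:
        match = ESTIMATE_RE.search(line)
        if match and operator is not None:
            result.append((operator, int(match.group(1)), match.group(2)))
            operator = None
        elif not match:
            # Keep only the operator name and drop its bracketed arguments.
            operator = re.split(r"[\[(]", line.strip(), maxsplit=1)[0]
    return result


estimates = estimated_rows(PLAN_LINES)
# The operator with the largest estimated row count.
largest = max(estimates, key=lambda item: item[1])
print(largest)
```

In this excerpt the customer TableScan carries the largest estimate, which matches the plan above, where the small nation side is replicated to the join.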
EXPLAIN ANALYZE
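You can use the EXPLAIN ANALYZE command to view the distributed execution plan of a query and its actual execution cost, including execution time, memory usage, and input and output data volume.
Syntax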
EXPLAIN ANALYZE <SELECT statement>;
Example
EXPLAIN ANALYZE SELECT count(*) FROM nation, region, customer WHERE c_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA';
The following result is returned:
Fragment 1 [SINGLE]
    Output: 1 row (9B), PeakMemory: 178KB, WallTime: 1.00ns, Input: 32 rows (288B); per task: avg.: 32.00 std.dev.: 0.00
    Output layout: [count]
    Output partitioning: SINGLE []
    Aggregate(FINAL)
    │ Outputs: [count:bigint]
    │ Estimates: {rows: 1 (8B)}
    │ Output: 2 rows (18B), PeakMemory: 24B (0.00%), WallTime: 70.39us (0.03%)
    │ count := count(`count_1`)
    └─ LocalExchange[SINGLE] ()
       │ Outputs: [count1:bigint]
       │ Estimates: {rows: ? (?)}
       │ Output: 64 rows (576B), PeakMemory: 8KB (0.07%), WallTime: 238.69us (0.10%)
       └─ RemoteSource[2]
          Outputs: [count2:bigint]
          Estimates:
          Output: 32 rows (288B), PeakMemory: 32KB (0.27%), WallTime: 182.82us (0.08%)
          Input avg.: 4.00 rows, Input std.dev.: 264.58%
Fragment 2 [adb:AdbPartitioningHandle{schema=tpch, tableName=customer, dimTable=false, shards=32, tableEngineType=Cstore, partitionColumns=c_custkey, prunedBuckets= empty}]
    Output: 32 rows (288B), PeakMemory: 6MB, WallTime: 164.00ns, Input: 1500015 rows (20.03MB); per task: avg.: 500005.00 std.dev.: 21941.36
    Output layout: [count4]
    Output partitioning: SINGLE []
    Aggregate(PARTIAL)
    │ Outputs: [count4:bigint]
    │ Estimates: {rows: 1 (8B)}
    │ Output: 64 rows (576B), PeakMemory: 336B (0.00%), WallTime: 1.01ms (0.42%)
    │ count_4 := count(*)
    └─ INNER Join[(`c_nationkey` = `n_nationkey`)][$hashvalue, $hashvalue6]
       │ Outputs: []
       │ Estimates: {rows: 302035 (4.61MB)}
       │ Output: 300285 rows (210B), PeakMemory: 641KB (5.29%), WallTime: 99.08ms (41.45%)
       │ Left (probe) Input avg.: 46875.00 rows, Input std.dev.: 311.24%
       │ Right (build) Input avg.: 0.63 rows, Input std.dev.: 264.58%
       │ Distribution: REPLICATED
       ├─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=customer, partitionColumnHandles=[c_custkey]}]
       │  Outputs: [c_nationkey:integer, $hashvalue:bigint]
       │  Estimates: {rows: 1500000 (5.72MB)}/{rows: 1500000 (5.72MB)}
       │  Output: 1500000 rows (20.03MB), PeakMemory: 5MB (44.38%), WallTime: 68.29ms (28.57%)
       │  $hashvalue := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`c_nationkey`), 0))
       │  c_nationkey := AdbColumnHandle{columnName=c_nationkey, type=4, isIndexed=true}
       │  Input: 1500000 rows (7.15MB), Filtered: 0.00%
       └─ LocalExchange[HASH][$hashvalue6] ("n_nationkey")
          │ Outputs: [n_nationkey:integer, $hashvalue6:bigint]
          │ Estimates: {rows: 5 (60B)}
          │ Output: 30 rows (420B), PeakMemory: 394KB (3.26%), WallTime: 455.03us (0.19%)
          └─ Project[]
             │ Outputs: [n_nationkey:integer, $hashvalue10:bigint]
             │ Estimates: {rows: 5 (60B)}
             │ Output: 15 rows (210B), PeakMemory: 24KB (0.20%), WallTime: 83.61us (0.03%)
             │ Input avg.: 0.63 rows, Input std.dev.: 264.58%
             │ $hashvalue_10 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_nationkey`), 0))
             └─ RemoteSource[3]
                Outputs: [n_nationkey:integer]
                Estimates:
                Output: 15 rows (75B), PeakMemory: 24KB (0.20%), WallTime: 45.97us (0.02%)
                Input avg.: 0.63 rows, Input std.dev.: 264.58%
Fragment 3 [adb:AdbPartitioningHandle{schema=tpch, tableName=nation, dimTable=true, shards=32, tableEngineType=Cstore, partitionColumns=, prunedBuckets= empty}]
    Output: 5 rows (25B), PeakMemory: 185KB, WallTime: 1.00ns, Input: 26 rows (489B); per task: avg.: 26.00 std.dev.: 0.00
    Output layout: [n_nationkey]
    Output partitioning: BROADCAST []
    INNER Join[(`n_regionkey` = `r_regionkey`)][$hashvalue7, $hashvalue8]
    │ Outputs: [n_nationkey:integer]
    │ Estimates: {rows: 5 (60B)}
    │ Output: 11 rows (64B), PeakMemory: 152KB (1.26%), WallTime: 255.86us (0.11%)
    │ Left (probe) Input avg.: 25.00 rows, Input std.dev.: 0.00%
    │ Right (build) Input avg.: 0.13 rows, Input std.dev.: 264.58%
    │ Distribution: REPLICATED
    ├─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=nation, partitionColumnHandles=[]}]
    │  Outputs: [n_nationkey:integer, n_regionkey:integer, $hashvalue7:bigint]
    │  Estimates: {rows: 25 (200B)}/{rows: 25 (200B)}
    │  Output: 25 rows (475B), PeakMemory: 16KB (0.13%), WallTime: 178.81us (0.07%)
    │  $hashvalue_7 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_regionkey`), 0))
    │  n_nationkey := AdbColumnHandle{columnName=n_nationkey, type=4, isIndexed=true}
    │  n_regionkey := AdbColumnHandle{columnName=n_regionkey, type=4, isIndexed=true}
    │  Input: 25 rows (250B), Filtered: 0.00%
    └─ LocalExchange[HASH][$hashvalue8] ("r_regionkey")
       │ Outputs: [r_regionkey:integer, $hashvalue8:bigint]
       │ Estimates: {rows: 1 (4B)}
       │ Output: 2 rows (28B), PeakMemory: 34KB (0.29%), WallTime: 57.41us (0.02%)
       └─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=region, partitionColumnHandles=[]}]
          Outputs: [r_regionkey:integer, $hashvalue9:bigint]
          Estimates: {rows: 1 (4B)}/{rows: 1 (4B)}
          Output: 1 row (14B), PeakMemory: 8KB (0.07%), WallTime: 308.99us (0.13%)
          $hashvalue_9 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`r_regionkey`), 0))
          r_regionkey := AdbColumnHandle{columnName=r_regionkey, type=4, isIndexed=true}
          Input: 1 row (5B), Filtered: 0.00%
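The following table describes the main parameters in the returned result.
Parameter | Description
Outputs: [symbol:type] | The output columns of each operator and their data types.
Estimates: {rows: %s (%sB)} | The estimated number of rows and data size for each operator. The optimizer uses these estimates to decide the join order and the data shuffle.
PeakMemory: %s | The total memory usage. Use it to locate memory bottlenecks.
WallTime: %s | The cumulative execution time of the operator. Use it to locate computing bottlenecks. Note: Because operators run in parallel, this value is not the actual elapsed time.
Input: %s rows (%sB) | The number of input rows and the input data size.
per task: avg.: %s std.dev.: %s | The average number of rows per task and its standard deviation. Use them to analyze data skew within a stage.
Output: %s row (%sB) | The number of output rows and the output data size.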
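One way to read the per task values is to divide the standard deviation by the average: the larger the ratio, the more unevenly rows are spread across tasks. The following Python snippet is a rough sketch of that calculation on the Fragment 1 and Fragment 2 headers from the example above. The function name fragment_stats and the SKEW_THRESHOLD value are hypothetical and are not defined by AnalyticDB for MySQL. The partitioning handle in the Fragment 2 header is shortened to "[...]" here only to keep the line short.

```python
import re

# Fragment headers copied from the EXPLAIN ANALYZE example above.
FRAGMENT_HEADERS = [
    "Fragment 1 [SINGLE] Output: 1 row (9B), PeakMemory: 178KB, WallTime: 1.00ns, "
    "Input: 32 rows (288B); per task: avg.: 32.00 std.dev.: 0.00",
    "Fragment 2 [...] Output: 32 rows (288B), PeakMemory: 6MB, WallTime: 164.00ns, "
    "Input: 1500015 rows (20.03MB); per task: avg.: 500005.00 std.dev.: 21941.36",
]

HEADER_RE = re.compile(
    r"Fragment (\d+) .*?PeakMemory: (\S+), WallTime: (\S+), "
    r"Input: (\d+) rows .*?avg\.: ([\d.]+) std\.dev\.: ([\d.]+)"
)

# Hypothetical cut-off for this sketch; choose a value that suits your workload.
SKEW_THRESHOLD = 0.5


def fragment_stats(header):
    """Extract the header metrics and the std.dev./avg. ratio of one fragment."""
    match = HEADER_RE.search(header)
    if match is None:
        raise ValueError("not a fragment header: " + header)
    fragment, peak, wall, rows, avg, std = match.groups()
    avg, std = float(avg), float(std)
    return {
        "fragment": int(fragment),
        "peak_memory": peak,
        "wall_time": wall,
        "input_rows": int(rows),
        # Ratio of standard deviation to average rows per task.
        "cv": std / avg if avg else 0.0,
    }


stats = [fragment_stats(header) for header in FRAGMENT_HEADERS]
skewed = [s["fragment"] for s in stats if s["cv"] > SKEW_THRESHOLD]
for s in stats:
    print(s["fragment"], s["peak_memory"], round(s["cv"], 4))
```

For the example above, the ratio for Fragment 2 is about 0.04, so neither fragment exceeds the illustrative threshold.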
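Use cases
You can use EXPLAIN ANALYZE to analyze some common plan issues.
Filters not pushed down
Of the following two queries, SQL 2 contains the function length(string_test), which cannot be pushed down. Unlike SQL 1, it must scan the full data set to evaluate the filter:
SQL 1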
SELECT count(*) FROM test WHERE string_test = 'a';
SQL 2
SELECT count(*) FROM test WHERE length(string_test) = 1;
Use EXPLAIN ANALYZE to analyze the execution plans of the two queries and compare Fragment 2 in each plan:
SQL 1 uses the TableScan operator, and Input avg. is 0.00 rows. This indicates that the filter was pushed down and 0 rows were scanned.
Fragment 2 [adb:AdbPartitioningHandle{schema=test4dmp, tableName=test, dimTable=false, shards=4, tableEngineType=Cstore, partitionColumns=id, prunedBuckets= empty}]
    Output: 4 rows (36B), PeakMemory: 0B, WallTime: 6.00ns, Input: 0 rows (0B); per task: avg.: 0.00 std.dev.: 0.00
    Output layout: [count_0_1]
    Output partitioning: SINGLE []
    Aggregate(PARTIAL)
    │ Outputs: [count_0_1:bigint]
    │ Estimates: {rows: 1 (8B)}
    │ Output: 8 rows (72B), PeakMemory: 0B (0.00%), WallTime: 212.92us (3.99%)
    │ count_0_1 := count(*)
    └─ TableScan[adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}]
       Outputs: []
       Estimates: {rows: 4 (0B)}
       Output: 0 rows (0B), PeakMemory: 0B (0.00%), WallTime: 4.76ms (89.12%)
       Input avg.: 0.00 rows, Input std.dev.: ?%
SQL 2 uses the ScanFilterProject operator, and the operator's Input is 9999 rows. In addition, the filterPredicate attribute is not empty (filterPredicate = (`test4dmp`.`length`(`string_test`) = BIGINT '1')). This indicates that the filter was not pushed down and 9999 rows were scanned.
Fragment 2 [adb:AdbPartitioningHandle{schema=test4dmp, tableName=test, dimTable=false, shards=4, tableEngineType=Cstore, partitionColumns=id, prunedBuckets= empty}]
    Output: 4 rows (36B), PeakMemory: 0B, WallTime: 102.00ns, Input: 0 rows (0B); per task: avg.: 0.00 std.dev.: 0.00
    Output layout: [count_0_1]
    Output partitioning: SINGLE []
    Aggregate(PARTIAL)
    │ Outputs: [count_0_1:bigint]
    │ Estimates: {rows: 1 (8B)}
    │ Output: 8 rows (72B), PeakMemory: 0B (0.00%), WallTime: 252.23us (0.12%)
    │ count_0_1 := count(*)
    └─ ScanFilterProject[table = adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}, filterPredicate = (`test4dmp`.`length`(`string_test`) = BIGINT '1')]
       Outputs: []
       Estimates: {rows: 9999 (312.47kB)}/{rows: 9999 (312.47kB)}/{rows: ? (?)}
       Output: 0 rows (0B), PeakMemory: 0B (0.00%), WallTime: 101.31ms (49.84%)
       string_test := AdbColumnHandle{columnName=string_test, type=13, isIndexed=true}
       Input: 9999 rows (110.32kB), Filtered: 100.00%
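The comparison above can also be done mechanically: a ScanFilterProject operator with a non-empty filterPredicate marks a filter that was evaluated after the scan rather than pushed down. The following Python snippet is a minimal sketch of such a check. It assumes each operator sits on its own line of the plan text, and the names SQL1_PLAN, SQL2_PLAN, and unpushed_filters are hypothetical.

```python
import re

# Operator lines taken from the two Fragment 2 plans above.
SQL1_PLAN = """\
Aggregate(PARTIAL)
└─ TableScan[adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}]
"""

SQL2_PLAN = """\
Aggregate(PARTIAL)
└─ ScanFilterProject[table = adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}, filterPredicate = (`test4dmp`.`length`(`string_test`) = BIGINT '1')]
"""

# Capture everything between "filterPredicate = " and the closing bracket
# at the end of the operator line.
FILTER_RE = re.compile(r"ScanFilterProject\[.*filterPredicate = (.*)\]\s*$")


def unpushed_filters(plan_text):
    """Return the predicates that ScanFilterProject operators evaluate themselves."""
    found = []
    for line in plan_text.splitlines():
        match = FILTER_RE.search(line)
        if match:
            found.append(match.group(1))
    return found


print(unpushed_filters(SQL1_PLAN))  # no ScanFilterProject operator in this plan
print(unpushed_filters(SQL2_PLAN))  # the length(string_test) predicate
```

A predicate reported by this check is a candidate for rewriting, for example by filtering on the column value directly as SQL 1 does.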
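Memory usage of bad SQL
You can check the PeakMemory value of each Fragment to locate resource consumption issues. Apart from Disaster Broadcast in the plan, a high PeakMemory value is usually caused by data expansion in a join, an overly large left table in a join, or a TableScan operator that scans too much data. In these cases, add conditions from a business perspective to limit the data volume. You can also check the PeakMemory percentage of each operator to find the operator that consumes the most resources, and then analyze it further.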
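To find the operator with the largest PeakMemory percentage, you can extract the percentages from the operator statistics lines and sort them. The following Python snippet is a rough sketch that uses four operators from Fragment 2 of the EXPLAIN ANALYZE example earlier in this topic. The names OPERATOR_STATS and rank_by_share are hypothetical, and the operator names are entered by hand rather than parsed from the plan.

```python
import re

# (operator, statistics line) pairs copied from Fragment 2 of the example.
OPERATOR_STATS = [
    ("Aggregate(PARTIAL)",
     "Output: 64 rows (576B), PeakMemory: 336B (0.00%), WallTime: 1.01ms (0.42%)"),
    ("INNER Join",
     "Output: 300285 rows (210B), PeakMemory: 641KB (5.29%), WallTime: 99.08ms (41.45%)"),
    ("ScanProject",
     "Output: 1500000 rows (20.03MB), PeakMemory: 5MB (44.38%), WallTime: 68.29ms (28.57%)"),
    ("LocalExchange[HASH]",
     "Output: 30 rows (420B), PeakMemory: 394KB (3.26%), WallTime: 455.03us (0.19%)"),
]

STATS_RE = re.compile(
    r"PeakMemory: (\S+) \(([\d.]+)%\), WallTime: (\S+) \(([\d.]+)%\)"
)


def rank_by_share(operators, field):
    """Sort operators by their percentage share of "memory" or "wall_time"."""
    rows = []
    for name, line in operators:
        match = STATS_RE.search(line)
        if match is None:
            continue  # skip lines without per-operator statistics
        rows.append({
            "operator": name,
            "memory": float(match.group(2)),
            "wall_time": float(match.group(4)),
        })
    return sorted(rows, key=lambda row: row[field], reverse=True)


by_memory = rank_by_share(OPERATOR_STATS, "memory")
by_wall_time = rank_by_share(OPERATOR_STATS, "wall_time")
print(by_memory[0]["operator"], by_memory[0]["memory"])
print(by_wall_time[0]["operator"], by_wall_time[0]["wall_time"])
```

In this example the ScanProject operator on the customer table has the largest memory share, while the INNER Join has the largest WallTime share, so the two rankings can point to different operators.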