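This topic describes how to manage execution plans so that the plans of repeated or complex queries are preserved over the long term.
Background information
The optimizer generates an execution plan for every SQL statement. In many cases, however, the SQL statements that an application sends are repetitive: they differ only in their parameters and are identical after parameterization. The parameterized SQL can therefore serve as the key of a cache that stores everything except the parameters, such as the execution plan. This cache is called the execution plan cache (Plan Cache).
On the other hand, the execution plan of a more complex query (for example, a join across multiple tables) should stay relatively stable and not change because of a version upgrade or similar events. To that end, execution plan management (Plan Management) records a set of execution plans for each SQL statement. These plans are persisted and are retained even across version upgrades.
Workflow overview
When PolarDB-X 1.0 receives a query SQL statement, it goes through the following steps: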
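- Parameterize the query SQL statement, replacing every parameter with the placeholder ?.
- Use the parameterized SQL as the key to look up the execution plan cache. If no plan is cached, call the optimizer to generate one.
- If the SQL statement is a simple query, execute it directly and skip the execution plan management steps.
- If the SQL statement is a complex query, use the execution plan fixed in its baseline (Baseline). If the baseline holds several plans, choose the one with the lowest cost.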
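The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PolarDB-X code: `Plan`, `plan_cache`, `baselines`, `is_complex`, and `optimize` are hypothetical names, the "contains a JOIN" test for complexity is an assumption, and the cost values are made up.

```python
import re
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    cost: float  # made-up cost value, for illustration only


plan_cache = {}  # parameterized SQL -> Plan produced by the optimizer
baselines = {}   # parameterized SQL -> list of Plans kept by plan management


def parameterize(sql):
    """Replace quoted strings and standalone numbers with ? (naive rule)."""
    return re.sub(r"'[^']*'|\b\d+\b", "?", sql)


def is_complex(sql):
    """Assumption for this sketch: a query is complex if it contains a JOIN."""
    return " join " in sql.lower()


def optimize(sql):
    """Placeholder optimizer that returns a dummy plan."""
    return Plan(name="optimizer-plan", cost=100.0)


def choose_plan(sql):
    key = parameterize(sql)          # step 1: parameterize
    plan = plan_cache.get(key)       # step 2: look up the plan cache
    if plan is None:
        plan = optimize(key)
        plan_cache[key] = plan
    if not is_complex(key):          # step 3: simple query, skip plan management
        return plan
    # step 4: complex query, pick the cheapest plan in the baseline
    candidates = baselines.setdefault(key, [plan])
    return min(candidates, key=lambda p: p.cost)
```

Two statements that differ only in a literal map to the same key, so the second call reuses the cached plan instead of invoking the optimizer again.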
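Execution plan cache
PolarDB-X 1.0 enables the execution plan cache by default. The HitCache field in the EXPLAIN result indicates whether the current SQL statement hit the execution plan cache. With the cache enabled, PolarDB-X 1.0 parameterizes each SQL statement: parameterization replaces the constants in the statement with the placeholder ? and builds the corresponding parameter list. The ? placeholder is also visible in the SQL of the LogicalView operator in the execution plan.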
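To make the parameterization step concrete, here is a minimal Python sketch that replaces literals with ? and collects them into a parameter list. It is a simplification built on a regular expression: the real system presumably works on the parsed statement, and the function name `parameterize` and the literal-matching rule are assumptions of this example.

```python
import re

# Naive literal matcher: single-quoted strings (with '' as an escaped quote)
# or standalone integer/decimal numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Return (parameterized_sql, params) for a SQL string."""
    params = []

    def _swap(match):
        text = match.group(0)
        if text.startswith("'"):
            # Strip the surrounding quotes and unescape doubled quotes.
            params.append(text[1:-1].replace("''", "'"))
        elif "." in text:
            params.append(float(text))
        else:
            params.append(int(text))
        return "?"

    return _LITERAL.sub(_swap, sql), params
```

For example, `parameterize("SELECT * FROM part WHERE p_name LIKE '%green%' AND p_size > 10")` yields the template `SELECT * FROM part WHERE p_name LIKE ? AND p_size > ?` together with the parameter list `['%green%', 10]`.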
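Execution plan management
A complex SQL statement passes through the execution plan management flow after it has gone through the execution plan cache.
Both the execution plan cache and execution plan management use the parameterized SQL as the key for looking up execution plans. The execution plan cache stores the plans of all SQL statements, whereas execution plan management handles only complex queries.
Because the concrete parameter values affect the cost, an SQL template and its optimal execution plan do not correspond one to one.
In execution plan management, each SQL statement corresponds to one baseline, and each baseline contains one or more execution plans. At run time, the plan with the lowest cost under the current parameters is chosen for execution.
When a plan from the execution plan cache enters execution plan management, SPM (SQL Plan Management) runs a procedure to determine whether the plan is already known. If it is known, SPM checks whether its cost is the lowest. If it is not known, SPM decides whether the plan needs to be executed once to judge how well optimized it is.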
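The following Python sketch illustrates why one SQL template can map to different best plans depending on the parameters. Everything in it is hypothetical: the `Plan` class, the `pick` function, and especially the two cost functions, which are invented numbers chosen only so that the cheaper plan changes with the selectivity of the LIKE condition. The plan IDs are the ones that appear in the BASELINE LIST output later in this topic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Plan:
    plan_id: int
    description: str
    cost: Callable[[float], float]  # hypothetical cost model: selectivity -> cost


# One baseline per parameterized SQL; each baseline holds one or more plans.
baselines: Dict[str, List[Plan]] = {}


def pick(parameterized_sql, selectivity):
    """Choose the plan with the lowest cost under the current parameters."""
    return min(baselines[parameterized_sql], key=lambda p: p.cost(selectivity))


key = "SELECT * FROM lineitem JOIN part ON l_partkey = p_partkey WHERE p_name LIKE ?"
baselines[key] = [
    # Invented cost: flat, because both sides are scanned regardless of the filter.
    Plan(-935671684, "ParallelHashJoin", lambda s: 1000.0),
    # Invented cost: grows with the fraction of part rows that match.
    Plan(-1024543942, "ParallelBKAJoin", lambda s: 50.0 + 5000.0 * s),
]
```

With these made-up costs, a highly selective parameter (`pick(key, 0.01)`) selects the BKA Join plan, while a weakly selective one (`pick(key, 0.5)`) selects the Hash Join plan.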
O&M commands
PolarDB-X 1.0 provides a rich set of commands for managing execution plans. The syntax is as follows:
BASELINE (LOAD|PERSIST|CLEAR|VALIDATE|LIST|DELETE) [Signed Integer,Signed Integer....]
BASELINE (ADD|FIX) SQL (HINT Select Statement)
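- BASELINE (ADD|FIX) SQL <HINT> <Select Statement>: records the execution plan that the SQL statement produces under the HINT and pins it in the baseline.
- BASELINE LOAD: refreshes the specified baselines from the system table into memory and makes them take effect.
- BASELINE LOAD_PLAN: refreshes the specified execution plans from the system table into memory and makes them take effect.
- BASELINE LIST: lists all current baselines.
- BASELINE PERSIST: persists the specified baselines to disk.
- BASELINE PERSIST_PLAN: persists the specified execution plans to disk.
- BASELINE CLEAR: removes a baseline from memory.
- BASELINE CLEAR_PLAN: removes an execution plan from memory.
- BASELINE DELETE: deletes a baseline from disk.
- BASELINE DELETE_PLAN: deletes an execution plan from disk.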
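As a small convenience, the two syntax forms above can be assembled programmatically. The helper functions below are hypothetical and only build the statement text; they do not connect to a database. They cover just the actions named in the two syntax lines, and they assume that IDs are joined with commas exactly as the syntax line shows.

```python
# Actions that take an optional list of signed integer IDs, per the syntax above.
ID_ACTIONS = {"LOAD", "PERSIST", "CLEAR", "VALIDATE", "LIST", "DELETE"}
# Actions that take a HINT followed by a SELECT statement.
SQL_ACTIONS = {"ADD", "FIX"}


def baseline_statement(action, ids=()):
    """Build 'BASELINE <ACTION> [id,id,...]' as a string."""
    action = action.upper()
    if action not in ID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    # int() rejects anything that is not a signed integer.
    id_list = ",".join(str(int(i)) for i in ids)
    return f"BASELINE {action} {id_list}".rstrip()


def baseline_sql_statement(action, hint, select):
    """Build 'BASELINE (ADD|FIX) SQL <HINT> <Select Statement>' as a string."""
    action = action.upper()
    if action not in SQL_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return f"BASELINE {action} SQL {hint} {select}"
```

For instance, `baseline_statement("persist", [-399023558])` produces `BASELINE PERSIST -399023558`, which can then be sent through any MySQL-compatible client.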
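Execution plan tuning in practice
After the data changes or the PolarDB-X 1.0 optimizer engine is upgraded, a better execution plan may become available for the same SQL statement. During automatic evolution, SPM adds the better plans that the cost-based optimizer (CBO) discovers to the baseline of the SQL statement. In addition, you can use SPM commands to optimize execution plans yourself.
Take the following SQL statement as an example: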
SELECT *
FROM lineitem JOIN part ON l_partkey=p_partkey
WHERE p_name LIKE '%green%';
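A plain EXPLAIN shows that the execution plan generated for this SQL statement uses a Hash Join, and BASELINE LIST shows that the baseline of this statement contains only this one plan: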
explain select * from lineitem join part on l_partkey=p_partkey where p_name like '%green%';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Gather(parallel=true) |
| ParallelHashJoin(condition="l_partkey = p_partkey", type="inner") |
| LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true) |
| LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true) |
| HitCache:true |
| |
| |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.06 sec)
baseline list;
+-------------+--------------------------------------------------------------------------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
| BASELINE_ID | PARAMETERIZED_SQL | PLAN_ID | EXTERNALIZED_PLAN | FIXED | ACCEPTED |
+-------------+--------------------------------------------------------------------------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
| -399023558 | SELECT *
FROM lineitem
JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ? | -935671684 |
Gather(parallel=true)
ParallelHashJoin(condition="l_partkey = p_partkey", type="inner")
LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true)
| 0 | 1 |
+-------------+--------------------------------------------------------------------------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
1 row in set (0.02 sec)
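Suppose that under certain conditions this SQL statement performs better with a BKA Join (Lookup Join). The first step is to use a HINT to steer PolarDB-X 1.0 into generating the expected execution plan. The HINT format for a BKA Join is: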
/*+TDDL:BKA_JOIN(lineitem, part)*/
Use EXPLAIN [HINT] [SQL] to check whether the resulting execution plan matches expectations:
explain /*+TDDL:bka_join(lineitem, part)*/ select * from lineitem join part on l_partkey=p_partkey where p_name like '%green%';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Gather(parallel=true) |
| ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner") |
| LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true) |
| Gather(concurrent=true) |
| LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)") |
| HitCache:false |
| |
| |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.14 sec)
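Note that the HINT has changed the join algorithm to BKA Join. This does not modify the baseline, however. To use the plan above every time this SQL statement is encountered, you must also add it to the baseline.
Use the BASELINE ADD command of execution plan management to add an execution plan for this SQL statement. The baseline then holds two execution plans for the statement, and the CBO chooses which one to run based on cost.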
baseline add sql /*+TDDL:bka_join(lineitem, part)*/ select * from lineitem join part on l_partkey=p_partkey where p_name like '%green%';
+-------------+--------+
| BASELINE_ID | STATUS |
+-------------+--------+
| -399023558 | OK |
+-------------+--------+
1 row in set (0.09 sec)
baseline list;
+-------------+--------------------------------------------------------------------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
| BASELINE_ID | PARAMETERIZED_SQL | PLAN_ID | EXTERNALIZED_PLAN | FIXED | ACCEPTED |
+-------------+--------------------------------------------------------------------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
| -399023558 | SELECT *
FROM lineitem
JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ? | -1024543942 |
Gather(parallel=true)
ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner")
LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
Gather(concurrent=true)
LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)")
| 0 | 1 |
| -399023558 | SELECT *
FROM lineitem
JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ? | -935671684 |
Gather(parallel=true)
ParallelHashJoin(condition="l_partkey = p_partkey", type="inner")
LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true)
| 0 | 1 |
+-------------+--------------------------------------------------------------------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
2 rows in set (0.03 sec)
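The BASELINE LIST output above shows that the BKA_JOIN-based execution plan has been added to the baseline of this SQL statement. If you now run EXPLAIN on the statement, PolarDB-X 1.0 chooses different execution plans as the p_name LIKE ? condition changes. To make PolarDB-X 1.0 always use the plan above (rather than pick one of the two), use the BASELINE FIX command to force the specified execution plan.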
baseline fix sql /*+TDDL:bka_join(lineitem, part)*/ select * from lineitem join part on l_partkey=p_partkey where p_name like '%green%';
+-------------+--------+
| BASELINE_ID | STATUS |
+-------------+--------+
| -399023558 | OK |
+-------------+--------+
1 row in set (0.07 sec)
baseline list\G
*************************** 1. row ***************************
BASELINE_ID: -399023558
PARAMETERIZED_SQL: SELECT *
FROM lineitem
JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ?
PLAN_ID: -1024543942
EXTERNALIZED_PLAN:
Gather(parallel=true)
ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner")
LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
Gather(concurrent=true)
LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)")
FIXED: 1
ACCEPTED: 1
*************************** 2. row ***************************
BASELINE_ID: -399023558
PARAMETERIZED_SQL: SELECT *
FROM lineitem
JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ?
PLAN_ID: -935671684
EXTERNALIZED_PLAN:
Gather(parallel=true)
ParallelHashJoin(condition="l_partkey = p_partkey", type="inner")
LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true)
FIXED: 0
ACCEPTED: 1
2 rows in set (0.01 sec)
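After the BASELINE FIX command runs, the FIXED flag of the BKA Join execution plan is set to 1. From this point on, even without the HINT, running EXPLAIN on this SQL statement under any condition always yields this execution plan.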
explain select * from lineitem join part on l_partkey=p_partkey where p_name like '%green%';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Gather(parallel=true) |
| ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner") |
| LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true) |
| Gather(concurrent=true) |
| LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)") |
| HitCache:true |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.01 sec)
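The effect of the FIXED flag on plan selection can be summarized with a short Python sketch. It is a hypothetical model rather than the actual implementation: the `BaselinePlan` class and `select_plan` function are invented names, the cost values are made up, and the rule that the cheapest fixed plan wins when more than one is fixed is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BaselinePlan:
    plan_id: int
    cost: float          # made-up cost under the current parameters
    fixed: bool = False  # mirrors the FIXED column of BASELINE LIST


def select_plan(plans):
    """Prefer fixed plans; otherwise fall back to the cheapest plan."""
    fixed = [p for p in plans if p.fixed]
    pool = fixed or plans
    return min(pool, key=lambda p: p.cost)


hash_join = BaselinePlan(plan_id=-935671684, cost=10.0)
bka_join = BaselinePlan(plan_id=-1024543942, cost=100.0)
```

Before BASELINE FIX, `select_plan([hash_join, bka_join])` returns whichever plan is cheaper for the current parameters. After setting `bka_join.fixed = True`, the same call returns the BKA Join plan even when its cost is higher.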