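When you run complex queries and need results quickly, you can use the AP acceleration engine (rds_duckdb) of ApsaraDB RDS for PostgreSQL. The engine provides columnar tables and vectorized execution, which significantly speeds up complex queries without requiring any changes to the original query statements.
You can join the RDS PostgreSQL extension DingTalk group (103525002795) to ask questions, exchange ideas, give feedback, and learn more about extensions.
Overview
rds_duckdb brings DuckDB, an efficient and resource-friendly engine, into RDS PostgreSQL to strengthen analytical queries. The extension exports local RDS PostgreSQL tables as columnar tables and enables Analytical Processing (AP) query acceleration, which significantly speeds up complex queries and better serves analytical workloads.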
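Prerequisites
The major version of the instance is RDS PostgreSQL 12 or later.
The minor engine version of the instance is 20241030 or later.
rds_duckdb has been added to the value of the shared_preload_libraries parameter.
For details on how to configure parameters, see Set instance parameters. For example, change the parameter value to 'pg_stat_statements,auto_explain,rds_duckdb'.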
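The prerequisites above can be partly checked from a SQL session. The following is a minimal sketch that uses only standard PostgreSQL commands. It does not cover the minor engine version, which is not exposed through a standard PostgreSQL setting.

```sql
-- Major version of the instance (should report 12 or later).
SHOW server_version;

-- Libraries preloaded at server start; rds_duckdb should appear in the list.
SHOW shared_preload_libraries;

-- The same check as a boolean, tolerant of spaces after the commas.
SELECT 'rds_duckdb' = ANY (
           string_to_array(
               replace(current_setting('shared_preload_libraries'), ' ', ''),
               ','
           )
       ) AS rds_duckdb_preloaded;
```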
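Usage notes
Exported columnar table data cannot currently be synchronized between the primary and secondary RDS PostgreSQL instances.
Automatic incremental synchronization of exported columnar table data is not currently supported.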
Create and delete the extension
Use a privileged account to create and delete the extension.
Create the extension
CREATE EXTENSION rds_duckdb;
View the DuckDB kernel version used by the extension
SELECT rds_duckdb.duckdb_version();
Delete the extension
DROP EXTENSION rds_duckdb;
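When these statements run from a script, it can help to make them safe to repeat. The sketch below relies only on standard PostgreSQL features (the IF NOT EXISTS and IF EXISTS clauses and the pg_extension catalog). It assumes rds_duckdb behaves like any other extension in this respect.

```sql
-- Create the extension only if it is not already installed in this database.
CREATE EXTENSION IF NOT EXISTS rds_duckdb;

-- Confirm that it is installed and see the extension version recorded by PostgreSQL.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'rds_duckdb';

-- When the extension is no longer needed, drop it without raising an error if it is absent.
DROP EXTENSION IF EXISTS rds_duckdb;
```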
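Manage columnar tables
Create a columnar table
Run the following command to export a local RDS PostgreSQL table (such as a user table, materialized view, or foreign table) as a columnar table. The columnar table is then used to accelerate analytical queries.
SELECT rds_duckdb.create_duckdb_table('local_table_name');
Refresh a columnar table
Run the following command to refresh the exported columnar table from the latest data in the local RDS PostgreSQL table. Both the table schema and the data are updated.
SELECT rds_duckdb.refresh_duckdb_table('local_table_name');
View the size of a columnar table
SELECT rds_duckdb.duckdb_table_size('local_table_name');
View the total size of all exported tables in the current database
SELECT rds_duckdb.duckdb_database_size();
Delete a columnar table
SELECT rds_duckdb.drop_duckdb_table('local_table_name');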
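The management functions above can be combined into one lifecycle. The following sketch uses a hypothetical table named demo_orders, which is not part of the original documentation. Because automatic incremental synchronization is not supported, rows added after the export reach the columnar table only after a refresh.

```sql
-- Hypothetical local table, used only for illustration.
CREATE TABLE demo_orders (
    order_id BIGINT NOT NULL,
    amount   DECIMAL(15,2) NOT NULL
);
INSERT INTO demo_orders VALUES (1, 10.00), (2, 25.50);

-- Export the local table as a columnar table.
SELECT rds_duckdb.create_duckdb_table('demo_orders');

-- New rows land only in the local table; nothing syncs them automatically.
INSERT INTO demo_orders VALUES (3, 7.25);

-- Refresh so that the columnar table picks up the new row.
SELECT rds_duckdb.refresh_duckdb_table('demo_orders');

-- Inspect sizes, then clean up.
SELECT rds_duckdb.duckdb_table_size('demo_orders');
SELECT rds_duckdb.duckdb_database_size();
SELECT rds_duckdb.drop_duckdb_table('demo_orders');
DROP TABLE demo_orders;
```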
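Manage AP acceleration
rds_duckdb currently accelerates read-only queries. After AP acceleration is enabled, a SQL statement is executed by DuckDB when it is a query and every table it references has a corresponding DuckDB columnar table. If the statement is a DML or DDL operation that is not yet supported, or if it references a table that has no columnar table, execution falls back to RDS PostgreSQL.
For a SQL statement that falls back to RDS PostgreSQL, the system reports a warning in the following format: WARNING: Trying to execute an operation with non-duckdb tables(test), fallback to PG. The parentheses list the RDS PostgreSQL tables that have no corresponding DuckDB columnar table.
Statements that are not read-only also trigger a warning: WARNING: Modification operations on DuckDB tables are currently not supported, fallback to PG.
Enable AP acceleration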
SET rds_duckdb.execution = on;
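The fallback rules described in this section can be observed with a small experiment. The sketch below uses a hypothetical table named demo_no_columnar. The comments state what the rules in this section predict; the statements have not been run against a live instance here.

```sql
-- Hypothetical table with no columnar table yet, used only for illustration.
CREATE TABLE demo_no_columnar (id INT);

SET rds_duckdb.execution = on;

-- No columnar table exists, so this query is expected to fall back to
-- RDS PostgreSQL with the "non-duckdb tables" warning.
SELECT count(*) FROM demo_no_columnar;

-- After the export, the same query becomes eligible for execution by DuckDB.
SELECT rds_duckdb.create_duckdb_table('demo_no_columnar');
SELECT count(*) FROM demo_no_columnar;

-- A write is not read-only, so it is expected to fall back with the
-- "Modification operations" warning.
INSERT INTO demo_no_columnar VALUES (1);

-- Clean up with acceleration disabled.
SET rds_duckdb.execution = off;
SELECT rds_duckdb.drop_duckdb_table('demo_no_columnar');
DROP TABLE demo_no_columnar;
```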
Set AP acceleration parameters
You can tune AP acceleration performance within a session by adjusting the following parameters. For example:
SET rds_duckdb.worker_threads = 32;
SET rds_duckdb.memory_limit = 16384;
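Parameter | Description | Recommended value |
rds_duckdb.worker_threads | Number of worker threads used for AP acceleration. Valid values: 1 to 255. Default value: 1, which means a single worker thread. | |
rds_duckdb.memory_limit | Memory limit for AP acceleration. Unit: MB (do not include the unit when you set the parameter). Valid values: 1 to INT32_MAX. Default value: 100, which means an upper limit of 100 MB. | |
For DuckDB parameters, see DuckDB.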
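The two parameters are set per session, so a common pattern is to raise them for one heavy query and then return to the defaults. The sketch below assumes that they behave like ordinary PostgreSQL session settings, so that SHOW and RESET work on them. The original documentation shows only SET, and the values 8 and 4096 are illustrative.

```sql
-- Inspect the current session values.
SHOW rds_duckdb.worker_threads;
SHOW rds_duckdb.memory_limit;

-- Raise both for this session only (memory_limit is in MB, without a unit suffix).
SET rds_duckdb.worker_threads = 8;
SET rds_duckdb.memory_limit = 4096;

-- Run the heavy analytical query here, then return to the configured defaults.
RESET rds_duckdb.worker_threads;
RESET rds_duckdb.memory_limit;
```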
Disable AP acceleration
SET rds_duckdb.execution = off;
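Examples
rds_duckdb performance test
This example uses a Linux environment and the standard TPC-H benchmark to evaluate how much rds_duckdb speeds up complex queries.
Create an ECS instance and generate the test data.
Download and install PostgreSQL from the official PostgreSQL website.
Download the dbgen tool.
wget https://github.com/electrum/tpch-dbgen/archive/refs/heads/master.zip
yum install -y unzip zip
unzip master.zip
cd tpch-dbgen-master/
# Remove the trailing '|' from each generated row
echo "#define EOL_HANDLING 1" >> config.h
make
./dbgen --help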
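Generate the test data.
This example stores the test data in /data/test and generates 100 GB of data. Choose a data path and dataset size that suit your needs.
./dbgen -s 100
mkdir -p /data/test/tpch_data
mv *.tbl /data/test/tpch_data
Note: The data volume directly affects query speed. TPC-H uses the scale factor (SF) to describe data volume: 1 SF corresponds to 1 GB, so 100 SF corresponds to 100 GB. The volume for 1 SF covers only the data in the eight tables and excludes indexes and other storage overhead, so reserve extra storage space when you prepare the data.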
Create an RDS PostgreSQL database that meets the prerequisites, and import the test data.
Run the following statements to create the eight TPC-H test tables.
CREATE TABLE customer(
    c_custkey BIGINT NOT NULL,
    c_name VARCHAR NOT NULL,
    c_address VARCHAR NOT NULL,
    c_nationkey INTEGER NOT NULL,
    c_phone VARCHAR NOT NULL,
    c_acctbal DECIMAL(15,2) NOT NULL,
    c_mktsegment VARCHAR NOT NULL,
    c_comment VARCHAR NOT NULL);
CREATE TABLE lineitem(
    l_orderkey BIGINT NOT NULL,
    l_partkey BIGINT NOT NULL,
    l_suppkey BIGINT NOT NULL,
    l_linenumber BIGINT NOT NULL,
    l_quantity DECIMAL(15,2) NOT NULL,
    l_extendedprice DECIMAL(15,2) NOT NULL,
    l_discount DECIMAL(15,2) NOT NULL,
    l_tax DECIMAL(15,2) NOT NULL,
    l_returnflag VARCHAR NOT NULL,
    l_linestatus VARCHAR NOT NULL,
    l_shipdate DATE NOT NULL,
    l_commitdate DATE NOT NULL,
    l_receiptdate DATE NOT NULL,
    l_shipinstruct VARCHAR NOT NULL,
    l_shipmode VARCHAR NOT NULL,
    l_comment VARCHAR NOT NULL);
CREATE TABLE nation(
    n_nationkey INTEGER NOT NULL,
    n_name VARCHAR NOT NULL,
    n_regionkey INTEGER NOT NULL,
    n_comment VARCHAR NOT NULL);
CREATE TABLE orders(
    o_orderkey BIGINT NOT NULL,
    o_custkey BIGINT NOT NULL,
    o_orderstatus VARCHAR NOT NULL,
    o_totalprice DECIMAL(15,2) NOT NULL,
    o_orderdate DATE NOT NULL,
    o_orderpriority VARCHAR NOT NULL,
    o_clerk VARCHAR NOT NULL,
    o_shippriority INTEGER NOT NULL,
    o_comment VARCHAR NOT NULL);
CREATE TABLE part(
    p_partkey BIGINT NOT NULL,
    p_name VARCHAR NOT NULL,
    p_mfgr VARCHAR NOT NULL,
    p_brand VARCHAR NOT NULL,
    p_type VARCHAR NOT NULL,
    p_size INTEGER NOT NULL,
    p_container VARCHAR NOT NULL,
    p_retailprice DECIMAL(15,2) NOT NULL,
    p_comment VARCHAR NOT NULL);
CREATE TABLE partsupp(
    ps_partkey BIGINT NOT NULL,
    ps_suppkey BIGINT NOT NULL,
    ps_availqty BIGINT NOT NULL,
    ps_supplycost DECIMAL(15,2) NOT NULL,
    ps_comment VARCHAR NOT NULL);
CREATE TABLE region(
    r_regionkey INTEGER NOT NULL,
    r_name VARCHAR NOT NULL,
    r_comment VARCHAR NOT NULL);
CREATE TABLE supplier(
    s_suppkey BIGINT NOT NULL,
    s_name VARCHAR NOT NULL,
    s_address VARCHAR NOT NULL,
    s_nationkey INTEGER NOT NULL,
    s_phone VARCHAR NOT NULL,
    s_acctbal DECIMAL(15,2) NOT NULL,
    s_comment VARCHAR NOT NULL);
-- List the created tables
\dt
Import the generated test data.
COPY customer FROM '/data/test/tpch_data/customer.tbl' DELIMITER '|';
COPY lineitem FROM '/data/test/tpch_data/lineitem.tbl' DELIMITER '|';
COPY nation FROM '/data/test/tpch_data/nation.tbl' DELIMITER '|';
COPY orders FROM '/data/test/tpch_data/orders.tbl' DELIMITER '|';
COPY part FROM '/data/test/tpch_data/part.tbl' DELIMITER '|';
COPY partsupp FROM '/data/test/tpch_data/partsupp.tbl' DELIMITER '|';
COPY region FROM '/data/test/tpch_data/region.tbl' DELIMITER '|';
COPY supplier FROM '/data/test/tpch_data/supplier.tbl' DELIMITER '|';
-- View details of the imported tables
\dt+
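After the import, a quick sanity check can catch a truncated or failed load before any benchmarking. The sketch below relies on the fact that the TPC-H nation and region tables have fixed sizes (25 and 5 rows) regardless of the scale factor. The lineitem checks are only for comparison with the scale factor that you chose.

```sql
-- nation and region are fixed-size tables in TPC-H.
SELECT (SELECT count(*) FROM nation) = 25 AS nation_ok,
       (SELECT count(*) FROM region) = 5  AS region_ok;

-- Row count and on-disk size of the largest table.
SELECT count(*) AS lineitem_rows FROM lineitem;
SELECT pg_size_pretty(pg_total_relation_size('lineitem')) AS lineitem_size;
```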
Install the rds_duckdb extension and generate the columnar tables.
Install the rds_duckdb extension.
CREATE EXTENSION rds_duckdb;
Generate the columnar tables.
-- Convert the local PG tables into the corresponding columnar tables
SELECT rds_duckdb.create_duckdb_table('customer');
SELECT rds_duckdb.create_duckdb_table('lineitem');
SELECT rds_duckdb.create_duckdb_table('nation');
SELECT rds_duckdb.create_duckdb_table('orders');
SELECT rds_duckdb.create_duckdb_table('part');
SELECT rds_duckdb.create_duckdb_table('partsupp');
SELECT rds_duckdb.create_duckdb_table('region');
SELECT rds_duckdb.create_duckdb_table('supplier');
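As an alternative to writing one call per table, psql can generate the calls from the catalog. This is a sketch under two assumptions: the public schema contains only the eight TPC-H tables, and create_duckdb_table accepts the unqualified table name as in the statements above. The \gexec meta-command is a standard psql feature that runs each value returned by the query as its own SQL statement.

```sql
-- Build one create_duckdb_table call per table in the public schema,
-- then let psql execute each generated statement.
SELECT format('SELECT rds_duckdb.create_duckdb_table(%L)', tablename)
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY tablename
\gexec

-- Total size of all exported columnar tables in this database.
SELECT rds_duckdb.duckdb_database_size();
```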
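Set the AP acceleration parameters.
-- Set the number of threads and the memory limit (in MB) to match your test machine
SET rds_duckdb.worker_threads = 32;
SET rds_duckdb.memory_limit = 8192;
Enable AP acceleration.
SET rds_duckdb.execution = on;
-- Turn on timing
\timing on
-- Redirect query results to a file
\o /data/test/tpch_data/tpch_out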
Run the following 22 standard TPC-H queries and measure query performance.
-- Q1
SELECT l_returnflag, l_linestatus,
    sum(l_quantity) AS sum_qty,
    sum(l_extendedprice) AS sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    avg(l_quantity) AS avg_qty,
    avg(l_extendedprice) AS avg_price,
    avg(l_discount) AS avg_disc,
    count(*) AS count_order
FROM lineitem
WHERE l_shipdate <= CAST('1998-09-02' AS date)
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
-- Q2
SELECT s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
FROM part, supplier, partsupp, nation, region
WHERE p_partkey = ps_partkey
    AND s_suppkey = ps_suppkey
    AND p_size = 15
    AND p_type LIKE '%BRASS'
    AND s_nationkey = n_nationkey
    AND n_regionkey = r_regionkey
    AND r_name = 'EUROPE'
    AND ps_supplycost = (
        SELECT min(ps_supplycost)
        FROM partsupp, supplier, nation, region
        WHERE p_partkey = ps_partkey
            AND s_suppkey = ps_suppkey
            AND s_nationkey = n_nationkey
            AND n_regionkey = r_regionkey
            AND r_name = 'EUROPE')
ORDER BY s_acctbal DESC, n_name, s_name, p_partkey
LIMIT 100;
-- Q3
SELECT l_orderkey, sum(l_extendedprice * (1 - l_discount)) AS revenue, o_orderdate, o_shippriority
FROM customer, orders, lineitem
WHERE c_mktsegment = 'BUILDING'
    AND c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderdate < CAST('1995-03-15' AS date)
    AND l_shipdate > CAST('1995-03-15' AS date)
GROUP BY l_orderkey, o_orderdate, o_shippriority
ORDER BY revenue DESC, o_orderdate
LIMIT 10;
-- Q4
SELECT o_orderpriority, count(*) AS order_count
FROM orders
WHERE o_orderdate >= CAST('1993-07-01' AS date)
    AND o_orderdate < CAST('1993-10-01' AS date)
    AND EXISTS (
        SELECT *
        FROM lineitem
        WHERE l_orderkey = o_orderkey
            AND l_commitdate < l_receiptdate)
GROUP BY o_orderpriority
ORDER BY o_orderpriority;
-- Q5
SELECT n_name, sum(l_extendedprice * (1 - l_discount)) AS revenue
FROM customer, orders, lineitem, supplier, nation, region
WHERE c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND l_suppkey = s_suppkey
    AND c_nationkey = s_nationkey
    AND s_nationkey = n_nationkey
    AND n_regionkey = r_regionkey
    AND r_name = 'ASIA'
    AND o_orderdate >= CAST('1994-01-01' AS date)
    AND o_orderdate < CAST('1995-01-01' AS date)
GROUP BY n_name
ORDER BY revenue DESC;
-- Q6
SELECT sum(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_shipdate >= CAST('1994-01-01' AS date)
    AND l_shipdate < CAST('1995-01-01' AS date)
    AND l_discount BETWEEN 0.05 AND 0.07
    AND l_quantity < 24;
-- Q7
SELECT supp_nation, cust_nation, l_year, sum(volume) AS revenue
FROM (
    SELECT n1.n_name AS supp_nation,
        n2.n_name AS cust_nation,
        extract(year FROM l_shipdate) AS l_year,
        l_extendedprice * (1 - l_discount) AS volume
    FROM supplier, lineitem, orders, customer, nation n1, nation n2
    WHERE s_suppkey = l_suppkey
        AND o_orderkey = l_orderkey
        AND c_custkey = o_custkey
        AND s_nationkey = n1.n_nationkey
        AND c_nationkey = n2.n_nationkey
        AND ((n1.n_name = 'FRANCE' AND n2.n_name = 'GERMANY')
            OR (n1.n_name = 'GERMANY' AND n2.n_name = 'FRANCE'))
        AND l_shipdate BETWEEN CAST('1995-01-01' AS date) AND CAST('1996-12-31' AS date)) AS shipping
GROUP BY supp_nation, cust_nation, l_year
ORDER BY supp_nation, cust_nation, l_year;
-- Q8
SELECT o_year,
    sum(CASE WHEN nation = 'BRAZIL' THEN volume ELSE 0 END) / sum(volume) AS mkt_share
FROM (
    SELECT extract(year FROM o_orderdate) AS o_year,
        l_extendedprice * (1 - l_discount) AS volume,
        n2.n_name AS nation
    FROM part, supplier, lineitem, orders, customer, nation n1, nation n2, region
    WHERE p_partkey = l_partkey
        AND s_suppkey = l_suppkey
        AND l_orderkey = o_orderkey
        AND o_custkey = c_custkey
        AND c_nationkey = n1.n_nationkey
        AND n1.n_regionkey = r_regionkey
        AND r_name = 'AMERICA'
        AND s_nationkey = n2.n_nationkey
        AND o_orderdate BETWEEN CAST('1995-01-01' AS date) AND CAST('1996-12-31' AS date)
        AND p_type = 'ECONOMY ANODIZED STEEL') AS all_nations
GROUP BY o_year
ORDER BY o_year;
-- Q9
SELECT nation, o_year, sum(amount) AS sum_profit
FROM (
    SELECT n_name AS nation,
        extract(year FROM o_orderdate) AS o_year,
        l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity AS amount
    FROM part, supplier, lineitem, partsupp, orders, nation
    WHERE s_suppkey = l_suppkey
        AND ps_suppkey = l_suppkey
        AND ps_partkey = l_partkey
        AND p_partkey = l_partkey
        AND o_orderkey = l_orderkey
        AND s_nationkey = n_nationkey
        AND p_name LIKE '%green%') AS profit
GROUP BY nation, o_year
ORDER BY nation, o_year DESC;
-- Q10
SELECT c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) AS revenue, c_acctbal, n_name, c_address, c_phone, c_comment
FROM customer, orders, lineitem, nation
WHERE c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderdate >= CAST('1993-10-01' AS date)
    AND o_orderdate < CAST('1994-01-01' AS date)
    AND l_returnflag = 'R'
    AND c_nationkey = n_nationkey
GROUP BY c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment
ORDER BY revenue DESC
LIMIT 20;
-- Q11
SELECT ps_partkey, sum(ps_supplycost * ps_availqty) AS value
FROM partsupp, supplier, nation
WHERE ps_suppkey = s_suppkey
    AND s_nationkey = n_nationkey
    AND n_name = 'GERMANY'
GROUP BY ps_partkey
HAVING sum(ps_supplycost * ps_availqty) > (
    SELECT sum(ps_supplycost * ps_availqty) * 0.0001000000
    FROM partsupp, supplier, nation
    WHERE ps_suppkey = s_suppkey
        AND s_nationkey = n_nationkey
        AND n_name = 'GERMANY')
ORDER BY value DESC;
-- Q12
SELECT l_shipmode,
    sum(CASE WHEN o_orderpriority = '1-URGENT' OR o_orderpriority = '2-HIGH' THEN 1 ELSE 0 END) AS high_line_count,
    sum(CASE WHEN o_orderpriority <> '1-URGENT' AND o_orderpriority <> '2-HIGH' THEN 1 ELSE 0 END) AS low_line_count
FROM orders, lineitem
WHERE o_orderkey = l_orderkey
    AND l_shipmode IN ('MAIL', 'SHIP')
    AND l_commitdate < l_receiptdate
    AND l_shipdate < l_commitdate
    AND l_receiptdate >= CAST('1994-01-01' AS date)
    AND l_receiptdate < CAST('1995-01-01' AS date)
GROUP BY l_shipmode
ORDER BY l_shipmode;
-- Q13
SELECT c_count, count(*) AS custdist
FROM (
    SELECT c_custkey, count(o_orderkey)
    FROM customer
    LEFT OUTER JOIN orders ON c_custkey = o_custkey
        AND o_comment NOT LIKE '%special%requests%'
    GROUP BY c_custkey) AS c_orders (c_custkey, c_count)
GROUP BY c_count
ORDER BY custdist DESC, c_count DESC;
-- Q14
SELECT 100.00 * sum(
        CASE WHEN p_type LIKE 'PROMO%' THEN l_extendedprice * (1 - l_discount) ELSE 0 END)
    / sum(l_extendedprice * (1 - l_discount)) AS promo_revenue
FROM lineitem, part
WHERE l_partkey = p_partkey
    AND l_shipdate >= date '1995-09-01'
    AND l_shipdate < CAST('1995-10-01' AS date);
-- Q15
SELECT s_suppkey, s_name, s_address, s_phone, total_revenue
FROM supplier, (
    SELECT l_suppkey AS supplier_no,
        sum(l_extendedprice * (1 - l_discount)) AS total_revenue
    FROM lineitem
    WHERE l_shipdate >= CAST('1996-01-01' AS date)
        AND l_shipdate < CAST('1996-04-01' AS date)
    GROUP BY supplier_no) revenue0
WHERE s_suppkey = supplier_no
    AND total_revenue = (
        SELECT max(total_revenue)
        FROM (
            SELECT l_suppkey AS supplier_no,
                sum(l_extendedprice * (1 - l_discount)) AS total_revenue
            FROM lineitem
            WHERE l_shipdate >= CAST('1996-01-01' AS date)
                AND l_shipdate < CAST('1996-04-01' AS date)
            GROUP BY supplier_no) revenue1)
ORDER BY s_suppkey;
-- Q16
SELECT p_brand, p_type, p_size, count(DISTINCT ps_suppkey) AS supplier_cnt
FROM partsupp, part
WHERE p_partkey = ps_partkey
    AND p_brand <> 'Brand#45'
    AND p_type NOT LIKE 'MEDIUM POLISHED%'
    AND p_size IN (49, 14, 23, 45, 19, 3, 36, 9)
    AND ps_suppkey NOT IN (
        SELECT s_suppkey
        FROM supplier
        WHERE s_comment LIKE '%Customer%Complaints%')
GROUP BY p_brand, p_type, p_size
ORDER BY supplier_cnt DESC, p_brand, p_type, p_size;
-- Q17
SELECT sum(l_extendedprice) / 7.0 AS avg_yearly
FROM lineitem, part
WHERE p_partkey = l_partkey
    AND p_brand = 'Brand#23'
    AND p_container = 'MED BOX'
    AND l_quantity < (
        SELECT 0.2 * avg(l_quantity)
        FROM lineitem
        WHERE l_partkey = p_partkey);
-- Q18
SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
FROM customer, orders, lineitem
WHERE o_orderkey IN (
        SELECT l_orderkey
        FROM lineitem
        GROUP BY l_orderkey
        HAVING sum(l_quantity) > 300)
    AND c_custkey = o_custkey
    AND o_orderkey = l_orderkey
GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
ORDER BY o_totalprice DESC, o_orderdate
LIMIT 100;
-- Q19
SELECT sum(l_extendedprice * (1 - l_discount)) AS revenue
FROM lineitem, part
WHERE (p_partkey = l_partkey
        AND p_brand = 'Brand#12'
        AND p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
        AND l_quantity >= 1
        AND l_quantity <= 1 + 10
        AND p_size BETWEEN 1 AND 5
        AND l_shipmode IN ('AIR', 'AIR REG')
        AND l_shipinstruct = 'DELIVER IN PERSON')
    OR (p_partkey = l_partkey
        AND p_brand = 'Brand#23'
        AND p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
        AND l_quantity >= 10
        AND l_quantity <= 10 + 10
        AND p_size BETWEEN 1 AND 10
        AND l_shipmode IN ('AIR', 'AIR REG')
        AND l_shipinstruct = 'DELIVER IN PERSON')
    OR (p_partkey = l_partkey
        AND p_brand = 'Brand#34'
        AND p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
        AND l_quantity >= 20
        AND l_quantity <= 20 + 10
        AND p_size BETWEEN 1 AND 15
        AND l_shipmode IN ('AIR', 'AIR REG')
        AND l_shipinstruct = 'DELIVER IN PERSON');
-- Q20
SELECT s_name, s_address
FROM supplier, nation
WHERE s_suppkey IN (
        SELECT ps_suppkey
        FROM partsupp
        WHERE ps_partkey IN (
                SELECT p_partkey
                FROM part
                WHERE p_name LIKE 'forest%')
            AND ps_availqty > (
                SELECT 0.5 * sum(l_quantity)
                FROM lineitem
                WHERE l_partkey = ps_partkey
                    AND l_suppkey = ps_suppkey
                    AND l_shipdate >= CAST('1994-01-01' AS date)
                    AND l_shipdate < CAST('1995-01-01' AS date)))
    AND s_nationkey = n_nationkey
    AND n_name = 'CANADA'
ORDER BY s_name;
-- Q21
SELECT s_name, count(*) AS numwait
FROM supplier, lineitem l1, orders, nation
WHERE s_suppkey = l1.l_suppkey
    AND o_orderkey = l1.l_orderkey
    AND o_orderstatus = 'F'
    AND l1.l_receiptdate > l1.l_commitdate
    AND EXISTS (
        SELECT *
        FROM lineitem l2
        WHERE l2.l_orderkey = l1.l_orderkey
            AND l2.l_suppkey <> l1.l_suppkey)
    AND NOT EXISTS (
        SELECT *
        FROM lineitem l3
        WHERE l3.l_orderkey = l1.l_orderkey
            AND l3.l_suppkey <> l1.l_suppkey
            AND l3.l_receiptdate > l3.l_commitdate)
    AND s_nationkey = n_nationkey
    AND n_name = 'SAUDI ARABIA'
GROUP BY s_name
ORDER BY numwait DESC, s_name
LIMIT 100;
-- Q22
SELECT cntrycode, count(*) AS numcust, sum(c_acctbal) AS totacctbal
FROM (
    SELECT substring(c_phone FROM 1 FOR 2) AS cntrycode, c_acctbal
    FROM customer
    WHERE substring(c_phone FROM 1 FOR 2) IN ('13', '31', '23', '29', '30', '18', '17')
        AND c_acctbal > (
            SELECT avg(c_acctbal)
            FROM customer
            WHERE c_acctbal > 0.00
                AND substring(c_phone FROM 1 FOR 2) IN ('13', '31', '23', '29', '30', '18', '17'))
        AND NOT EXISTS (
            SELECT *
            FROM orders
            WHERE o_custkey = c_custkey)) AS custsale
GROUP BY cntrycode
ORDER BY cntrycode;
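To see the effect of AP acceleration on a single query, the same statement can be timed with acceleration off and then on in one psql session. The sketch below reuses Q6 from the list above. It assumes that the columnar table for lineitem already exists and that the session parameters are set as in the earlier steps. The \o meta-command without an argument is standard psql and sends query output back to the terminal.

```sql
-- Send query output back to the terminal and show elapsed time per statement.
\o
\timing on

-- Baseline: Q6 executed by RDS PostgreSQL.
SET rds_duckdb.execution = off;
SELECT sum(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_shipdate >= CAST('1994-01-01' AS date)
    AND l_shipdate < CAST('1995-01-01' AS date)
    AND l_discount BETWEEN 0.05 AND 0.07
    AND l_quantity < 24;

-- Same query with AP acceleration enabled.
SET rds_duckdb.execution = on;
SELECT sum(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_shipdate >= CAST('1994-01-01' AS date)
    AND l_shipdate < CAST('1995-01-01' AS date)
    AND l_discount BETWEEN 0.05 AND 0.07
    AND l_quantity < 24;
```

Running each variant more than once gives steadier numbers, because the first run may include cold-cache effects.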
View SQL execution plans
Use the EXPLAIN statement to compare the execution plan of a SQL statement with AP acceleration enabled and disabled. For example:
With AP acceleration enabled, the execution plan is as follows.
tpch_10x=# SET rds_duckdb.execution = on;
SET
tpch_10x=# EXPLAIN SELECT
tpch_10x-#     100.00 * sum(
tpch_10x(#         CASE WHEN p_type LIKE 'PROMO%' THEN
tpch_10x(#             l_extendedprice * (1 - l_discount)
tpch_10x(#         ELSE
tpch_10x(#             0
tpch_10x(#         END) / sum(l_extendedprice * (1 - l_discount)) AS promo_revenue
tpch_10x-# FROM
tpch_10x-#     lineitem,
tpch_10x-#     part
tpch_10x-# WHERE
tpch_10x-#     l_partkey = p_partkey
tpch_10x-#     AND l_shipdate >= date '1995-09-01'
tpch_10x-#     AND l_shipdate < CAST('1995-10-01' AS date);
                         QUERY PLAN
------------------------------------------------------------
 Custom Scan (DuckDBNode)  (cost=0.00..0.00 rows=0 width=0)
   DuckDB Plan:
 ┌───────────────────────────┐
 │         PROJECTION        │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │        Projections:       │
 │       promo_revenue       │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │   Estimated Cardinality:  │
 │             1             │
 └─────────────┬─────────────┘
 ┌─────────────┴─────────────┐
 │    UNGROUPED_AGGREGATE    │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │        Aggregates:        │
 │          sum(#0)          │
 │          sum(#1)          │
 └─────────────┬─────────────┘
 ┌─────────────┴─────────────┐
 │         PROJECTION        │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │        Projections:       │
 │ CASE WHEN (prefix(p_type, │
 │   'PROMO')) THEN (CAST(   │
 │ (l_extendedprice * (1.000 │
 │    - CAST(l_discount AS   │
 │    DECIMAL(18,3)))) AS    │
 │   DECIMAL(20,5))) ELSE 0  │
 │         .00000 END        │
 │ (l_extendedprice * (1.000 │
 │    - CAST(l_discount AS   │
 │     DECIMAL(18,3))))      │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │   Estimated Cardinality:  │
 │          6600339          │
 └─────────────┬─────────────┘
 ┌─────────────┴─────────────┐
 │         HASH_JOIN         │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │         Join Type:        │
 │           INNER           │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │        Conditions:        ├──────────────┐
 │   l_partkey = p_partkey   │              │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │              │
 │   Estimated Cardinality:  │              │
 │          6600339          │              │
 └─────────────┬─────────────┘              │
 ┌─────────────┴─────────────┐┌─────────────┴─────────────┐
 │          SEQ_SCAN         ││          SEQ_SCAN         │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │        Stringified:       ││        Stringified:       │
 │          lineitem         ││            part           │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │        Projections:       ││        Projections:       │
 │         l_partkey         ││         p_partkey         │
 │      l_extendedprice      ││           p_type          │
 │         l_discount        ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   Estimated Cardinality:  │
 │          Filters:         ││          2000000          │
 │ l_shipdate>='1995-09-01': ││                           │
 │ :DATE AND l_shipdate<'1995││                           │
 │     -10-01'::DATE AND     ││                           │
 │   l_shipdate IS NOT NULL  ││                           │
 │   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││                           │
 │   Estimated Cardinality:  ││                           │
 │          11997210         ││                           │
 └───────────────────────────┘└───────────────────────────┘
(71 rows)
With AP acceleration disabled, the execution plan is as follows.
tpch_10x=# SET rds_duckdb.execution = off;
SET
tpch_10x=# EXPLAIN SELECT
    100.00 * sum(
        CASE WHEN p_type LIKE 'PROMO%' THEN
            l_extendedprice * (1 - l_discount)
        ELSE
            0
        END) / sum(l_extendedprice * (1 - l_discount)) AS promo_revenue
FROM
    lineitem,
    part
WHERE
    l_partkey = p_partkey
    AND l_shipdate >= date '1995-09-01'
    AND l_shipdate < CAST('1995-10-01' AS date);
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=1286740.42..1286740.43 rows=1 width=32)
   ->  Gather  (cost=1286739.96..1286740.37 rows=4 width=64)
         Workers Planned: 4
         ->  Partial Aggregate  (cost=1285739.96..1285739.97 rows=1 width=64)
               ->  Parallel Hash Join  (cost=1235166.04..1282419.39 rows=189747 width=33)
                     Hash Cond: (part.p_partkey = lineitem.l_partkey)
                     ->  Parallel Seq Scan on part  (cost=0.00..43232.15 rows=500016 width=29)
                     ->  Parallel Hash  (cost=1233776.40..1233776.40 rows=111171 width=20)
                           ->  Parallel Seq Scan on lineitem  (cost=0.00..1233776.40 rows=111171 width=20)
                                 Filter: ((l_shipdate >= '1995-09-01'::date) AND (l_shipdate < '1995-10-01'::date))
 JIT:
   Functions: 17
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(13 rows)
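The two plans above suggest a quick way to tell whether a particular query will be accelerated before running it. The following sketch applies the same EXPLAIN check to Q6. The interpretation in the comments is inferred from the two example plans, not from a documented guarantee.

```sql
SET rds_duckdb.execution = on;

-- A plan that starts with "Custom Scan (DuckDBNode)" indicates execution by DuckDB.
-- A regular PostgreSQL plan (for example, an Aggregate over a Seq Scan) indicates
-- that the query fell back to RDS PostgreSQL.
EXPLAIN
SELECT sum(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_shipdate >= CAST('1994-01-01' AS date)
    AND l_shipdate < CAST('1995-01-01' AS date)
    AND l_discount BETWEEN 0.05 AND 0.07
    AND l_quantity < 24;
```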