用例 5 — 子查询或 CTE - AWS 规范性指导

本文属于机器翻译版本。若本译文内容与英语原文存在差异,则一律以英文原文为准。

用例 5 — 子查询或 CTE

公用表表达式 (CTE) 有助于将大型查询分解为较小的查询。这使得整个查询更易于维护。

子查询联接被 CTE 联接所取代,后者更具可读性,因为查询是在 CTE 部分中命名和分隔的。当查询大小增加并且查询变得更难维护时,这特别有用。此外,还实现了 PostgreSQL 中的 CTE 结果。如果您在多个地方调用 CTE,则实际的查询定义将只运行一次。结果将存储在内存中。您可以将其用于任何必须在同一个查询中的多个位置使用的复杂逻辑。将该逻辑放入 CTE 中,然后多次调用 CTE。

例如,某客户正在使用内联应用程序查询,查询中包含许多子查询。子查询按应用程序发送的输入参数值进行过滤。

EXPLAIN ANALYZE SELECT * FROM ORDER_DETAILS A WHERE A.ORDID IN (SELECT ORDID FROM PAYMENT_DETAILS) AND A.ORDID IN (SELECT ORDID FROM ITEM_DETAILS ) AND A.ORDID = 1000000;
"Nested Loop Semi Join (cost=3000.00..194258.21 rows=5 width=74) (actual time=201.605..747.945 rows=5 loops=1)" " -> Nested Loop Semi Join (cost=2000.00..135040.47 rows=5 width=74) (actual time=146.016..666.779 rows=5 loops=1)" " -> Gather (cost=1000.00..78580.31 rows=5 width=74) (actual time=58.893..463.570 rows=5 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Parallel Seq Scan on order_details a (cost=0.00..77579.81 rows=2 width=74) (actual time=165.627..549.702 rows=2 loops=3)" " Filter: (ordid = 1000000)" " Rows Removed by Filter: 1666665" " -> Materialize (cost=1000.00..56460.07 rows=3 width=4) (actual time=17.424..40.638 rows=1 loops=5)" " -> Gather (cost=1000.00..56460.06 rows=3 width=4) (actual time=87.113..203.178 rows=1 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Parallel Seq Scan on payment_details (cost=0.00..55459.76 rows=1 width=4) (actual time=174.431..423.792 rows=1 loops=3)" " Filter: (ordid = 1000000)" " Rows Removed by Filter: 1333002" " -> Materialize (cost=1000.00..59217.64 rows=4 width=4) (actual time=11.117..16.231 rows=1 loops=5)" " -> Gather (cost=1000.00..59217.62 rows=4 width=4) (actual time=55.581..81.148 rows=1 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Parallel Seq Scan on item_details (cost=0.00..58217.22 rows=2 width=4) (actual time=287.030..411.004 rows=1 loops=3)" " Filter: (ordid = 1000000)" " Rows Removed by Filter: 1333080" "Planning Time: 0.266 ms" "Execution Time: 747.986 ms"

在使用 CTE 修改子查询并添加筛选器以便仅检索必需的行集之后,查询性能得到改善。

EXPLAIN ANALYZE WITH PAYMENT AS ( SELECT * FROM PAYMENT_DETAILS WHERE ORDID = 1000000 ), ITEM AS (SELECT * FROM ITEM_DETAILS WHERE ORDID = 1000000) SELECT * FROM ORDER_DETAILS A JOIN PAYMENT B ON A.ORDID=B.ORDID JOIN ITEM C ON B.ORDID=C.ORDID
"Nested Loop (cost=3000.00..194258.91 rows=60 width=166) (actual time=586.410..732.918 rows=80 loops=1)" " -> Nested Loop (cost=2000.00..115677.83 rows=12 width=92) (actual time=456.760..457.083 rows=16 loops=1)" " -> Gather (cost=1000.00..59217.62 rows=4 width=48) (actual time=153.802..154.060 rows=4 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Parallel Seq Scan on item_details (cost=0.00..58217.22 rows=2 width=48) (actual time=85.417..249.045 rows=1 loops=3)" " Filter: (ordid = 1000000)" " Rows Removed by Filter: 1333332" " -> Materialize (cost=1000.00..56460.07 rows=3 width=44) (actual time=75.738..75.753 rows=4 loops=4)" " -> Gather (cost=1000.00..56460.06 rows=3 width=44) (actual time=302.947..303.005 rows=4 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Parallel Seq Scan on payment_details (cost=0.00..55459.76 rows=1 width=44) (actual time=184.609..294.784 rows=1 loops=3)" " Filter: (ordid = 1000000)" " Rows Removed by Filter: 1333332" " -> Materialize (cost=1000.00..78580.34 rows=5 width=74) (actual time=8.103..17.238 rows=5 loops=16)" " -> Gather (cost=1000.00..78580.31 rows=5 width=74) (actual time=129.641..275.795 rows=5 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Parallel Seq Scan on order_details a (cost=0.00..77579.81 rows=2 width=74) (actual time=78.556..268.994 rows=2 loops=3)" " Filter: (ordid = 1000000)" " Rows Removed by Filter: 1666665" "Planning Time: 0.108 ms" "Execution Time: 732.953 ms"

这些是从示例数据中得出的观测结果。当你在庞大的数据集上运行查询时,性能差异会很大。