| From: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Dian Fay <di(at)nmfay(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_plan_advice |
| Date: | 2025-12-17 10:12:40 |
| Message-ID: | CAKZiRmx538gbg=sV8CQUoR2zf6q5Y3c_1pGBd1A7XONG-5c2eg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Here's v7.
[..]
OK, so I've tested today from Your's branch directly, so I hope that
was also v7. Given the following q20 query:
SELECT s_name, s_address
FROM supplier, nation
WHERE s_suppkey in
(SELECT ps_suppkey
FROM partsupp
WHERE ps_partkey in
(SELECT p_partkey
FROM part
WHERE p_name LIKE 'forest%' )
AND ps_availqty >
(SELECT 0.5 * sum(l_quantity)
FROM lineitem
WHERE l_partkey = ps_partkey
AND l_suppkey = ps_suppkey
AND l_shipdate >= DATE '1994-01-01'
AND l_shipdate < DATE '1994-01-01' + INTERVAL '1' year ) )
AND s_nationkey = n_nationkey
AND n_name = 'CANADA'
ORDER BY s_name;
in normal conditions (w/o advice) the above query generates:
Sort (cost=1010985030.44..1010985030.59 rows=61 width=51)
Sort Key: supplier.s_name
-> Nested Loop (cost=0.42..1010985028.63 rows=61 width=51)
Join Filter: (nation.n_nationkey = supplier.s_nationkey)
-> Seq Scan on nation (cost=0.00..1.31 rows=1 width=4)
Filter: (n_name = 'CANADA'::bpchar)
-> Nested Loop Semi Join (cost=0.42..1010985008.29
rows=1522 width=55)
Join Filter: (partsupp.ps_suppkey = supplier.s_suppkey)
-> Seq Scan on supplier (cost=0.00..249.30 rows=7730 width=59)
-> Materialize (cost=0.42..1010755994.57 rows=1973 width=4)
-> Nested Loop (cost=0.42..1010755984.71
rows=1973 width=4)
-> Seq Scan on part (cost=0.00..4842.25
rows=1469 width=4)
Filter: ((p_name)::text ~~ 'forest%'::text)
-> Index Scan using pk_partsupp on
partsupp (cost=0.42..688053.87 rows=1 width=8)
Index Cond: (ps_partkey = part.p_partkey)
Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
SubPlan expr_1
-> Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
-> Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))
Generated Plan Advice:
JOIN_ORDER(nation (supplier (part partsupp)))
NESTED_LOOP_PLAIN(partsupp partsupp) <--- [X]
NESTED_LOOP_MATERIALIZE(partsupp)
SEQ_SCAN(nation supplier part lineitem(at)expr_1)
INDEX_SCAN(partsupp public.pk_partsupp)
SEMIJOIN_NON_UNIQUE((partsupp part))
NO_GATHER(supplier nation partsupp part lineitem(at)expr_1)
Please see the - I think it's confusing? -
NESTED_LOOP_MATERIALIZE(partsupp partsupp) - that's 2x the same
string? This causes it to turn into below plan -- I've marked the
problem with [X]
Sort (cost=50035755.50..50035755.66 rows=61 width=51)
Sort Key: supplier.s_name
-> Nested Loop (cost=12562154.32..50035753.70 rows=61 width=51)
Join Filter: (nation.n_nationkey = supplier.s_nationkey)
-> Seq Scan on nation (cost=0.00..1.31 rows=1 width=4)
Filter: (n_name = 'CANADA'::bpchar)
-> Nested Loop Semi Join (cost=12562154.32..50035733.36
rows=1522 width=55)
[X] -- missing Join Filter here
-> Seq Scan on supplier (cost=0.00..249.30 rows=7730 width=59)
[X] -- HJ instead of Materialize+Nested Loop below:
-> Hash Join (cost=12562154.32..12567002.09 rows=1 width=4)
Hash Cond: (part.p_partkey = partsupp.ps_partkey)
-> Seq Scan on part (cost=0.00..4842.25
rows=1469 width=4)
Filter: ((p_name)::text ~~ 'forest%'::text)
-> Hash (cost=12562154.02..12562154.02 rows=24 width=8)
-> Index Scan using pk_partsupp on
partsupp (cost=0.42..12562154.02 rows=24 width=8)
[X] -- wrong Index Cond below
(suppkey instead of partkey)
Index Cond: (ps_suppkey = supplier.s_suppkey)
Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
SubPlan expr_1
-> Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
-> Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))
Supplied Plan Advice:
SEQ_SCAN(nation) /* matched */
SEQ_SCAN(supplier) /* matched */
SEQ_SCAN(part) /* matched */
SEQ_SCAN(lineitem(at)expr_1) /* matched */
INDEX_SCAN(partsupp public.pk_partsupp) /* matched */
JOIN_ORDER(nation (supplier (part partsupp))) /* matched, conflicting */
NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */
SEMIJOIN_NON_UNIQUE((partsupp part)) /* matched, conflicting */
NO_GATHER(supplier) /* matched */
NO_GATHER(nation) /* matched */
NO_GATHER(partsupp) /* matched */
NO_GATHER(part) /* matched */
NO_GATHER(lineitem(at)expr_1) /* matched */
So the difference is basically between:
set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(partsupp
partsupp) NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which causes wrong plan and outcome:
NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */
and apparently proper advice like below which has better yield:
set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(part partsupp)
NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which is not generated , but caused good plan, however it also prints:
NESTED_LOOP_PLAIN(part) /* matched, conflicting, failed */
NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting */
but that seems "failed" there, seems to be untrue?
Another idea is perhaps, we could have some elog(WARNING) - but not
Asserts() - in assert-only enabled build that could alert us in case
of duplicated entries being detected for the same ops in
pg_plan_advice_explain_feedback()?
-J.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2025-12-17 10:19:58 | Re: Proposal: Conflict log history table for Logical Replication |
| Previous Message | vignesh C | 2025-12-17 10:07:09 | Re: [Proposal] Adding Log File Capability to pg_createsubscriber |