From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Andy Fan <zhihuifan1213(at)163(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com> |
Subject: | Re: Pathify RHS unique-ification for semijoin planning |
Date: | 2025-07-04 01:41:35 |
Message-ID: | CAMbWs49+V3m8ghSDUyUBEziXhBgfRZ8GCLu-kWZqGpiXW8i=Bw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jul 3, 2025 at 7:06 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> This patch does not apply again, so here is a new rebase.
>
> This version also fixes an issue related to parameterized paths: if
> the RHS has LATERAL references to the LHS, unique-ification becomes
> meaningless because the RHS depends on the LHS, and such paths should
> not be generated.
(The cc list is somehow lost; re-ccing.)
FWIW, I noticed that the row/cost estimates for the unique-ification
node on master can be very wrong. For example:
create table t(a int, b int);
insert into t select i%100, i from generate_series(1,10000)i;
vacuum analyze t;
set enable_hashagg to off;
explain (costs on)
select * from t t1, t t2 where (t1.a, t2.b) in
(select a, b from t t3 where t1.b is not null offset 0);
And look at the snippet from the plan:
(on master)
 ->  Unique  (cost=934.39..1009.39 rows=10000 width=8)
       ->  Sort  (cost=271.41..271.54 rows=50 width=8)
             Sort Key: "ANY_subquery".a, "ANY_subquery".b
             ->  Subquery Scan on "ANY_subquery"  (cost=0.00..270.00 rows=50 width=8)
The row estimate for the subpath is 50, but it increases to 10000
after unique-ification. How does that make sense? Removing duplicates
can never yield more rows than the input.
This issue does not occur with this patch:
(on patched)
 ->  Unique  (cost=271.41..271.79 rows=50 width=8)
       ->  Sort  (cost=271.41..271.54 rows=50 width=8)
             Sort Key: "ANY_subquery".a, "ANY_subquery".b
             ->  Subquery Scan on "ANY_subquery"  (cost=0.00..270.00 rows=50 width=8)
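As a rough illustration (not part of the patch or of PostgreSQL), the mismatch in the two snippets can be checked mechanically. The sketch below parses text-format EXPLAIN lines and flags any Unique node whose row estimate exceeds that of the node listed directly beneath it. The function names are hypothetical, and it assumes a single linear chain of nodes, as in these snippets, so that each node's input is the next node line.

```python
import re

# Matches the "(cost=startup..total rows=N width=W)" part of an EXPLAIN line.
ESTIMATE_RE = re.compile(r"\(cost=(\d+\.\d+)\.\.(\d+\.\d+) rows=(\d+) width=(\d+)\)")

# The two plan snippets quoted in this message.
MASTER_PLAN = """\
->  Unique  (cost=934.39..1009.39 rows=10000 width=8)
      ->  Sort  (cost=271.41..271.54 rows=50 width=8)
            Sort Key: "ANY_subquery".a, "ANY_subquery".b
            ->  Subquery Scan on "ANY_subquery"  (cost=0.00..270.00 rows=50 width=8)
"""

PATCHED_PLAN = """\
->  Unique  (cost=271.41..271.79 rows=50 width=8)
      ->  Sort  (cost=271.41..271.54 rows=50 width=8)
            Sort Key: "ANY_subquery".a, "ANY_subquery".b
            ->  Subquery Scan on "ANY_subquery"  (cost=0.00..270.00 rows=50 width=8)
"""


def parse_estimates(plan_text):
    """Return (label, startup_cost, total_cost, rows) for each plan node, top-down."""
    nodes = []
    for line in plan_text.splitlines():
        m = ESTIMATE_RE.search(line)
        if m is None:
            continue  # lines such as "Sort Key: ..." carry no estimates
        label = line[:m.start()].replace("->", "").strip()
        nodes.append((label, float(m.group(1)), float(m.group(2)), int(m.group(3))))
    return nodes


def suspicious_unique_nodes(plan_text):
    """Return (unique_rows, input_rows) for Unique nodes that estimate more rows than their input.

    Assumption: the plan is a linear chain, so the input of a node is
    simply the next node line in the text.
    """
    nodes = parse_estimates(plan_text)
    flagged = []
    for parent, child in zip(nodes, nodes[1:]):
        if parent[0] == "Unique" and parent[3] > child[3]:
            flagged.append((parent[3], child[3]))
    return flagged


if __name__ == "__main__":
    print("master: ", suspicious_unique_nodes(MASTER_PLAN))
    print("patched:", suspicious_unique_nodes(PATCHED_PLAN))
```

On the master snippet this flags the Unique node (10000 estimated rows over an input of 50); on the patched snippet nothing is flagged.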
Thanks
Richard