Re: apply_scanjoin_target_to_paths and partitionwise join

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, arne(dot)roland(at)malkut(dot)net, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
Subject: Re: apply_scanjoin_target_to_paths and partitionwise join
Date: 2025-11-21 22:06:36
Message-ID: CA+TgmobPLS=xL_n01DjOkBCJCyucWTFOahdi5kH_uQ8Y+wVFTA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 31, 2025 at 2:40 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Thoughts?

I guess not.

I spent a bunch more time investigating what was going on with the one
problematic test case that I mentioned in my previous email. I
discovered that my prior analysis missed a key point about this test
case, which is that it involves a row-count inflating join. It uses
"t1.c = t2.c" and "t1.c = t3.c" as join clauses, but the tables have
15-25 rows and there are only 5 possible values for column c. So, rows
have multiple join partners. What this means is that when we pick
a partitionwise plan, the total number of rows that we estimate will
need to be processed by an Append node is much greater. In the
non-partitionwise plan, we need to append each pair of partitions, so
the total number of rows that pass through Append across the whole
plan tree is estimated as 15+15+25=55. In the partitionwise plan, the
joins are performed first, increasing the row count because of the
multiple join partners, and then we estimate we'll need to Append
42+250=292 rows. Therefore, the planner is actually being reasonable
to choose the non-partitionwise plan: avoiding pushing a lot of extra
rows through an Append node is a legitimate choice. However, it
defeats the purpose of the test case. So, I just added more join
clauses, joining on "t1.a = t2.a AND t1.c = t2.c" and then on "t1.a =
t3.a AND t1.c = t3.c". This preserves the original plan shape by
removing the row-count inflation from the plan, allowing the
partitionwise plan to win as before.
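
As an illustration of the row-count inflation described above, here is a
minimal sketch in Python. It is not taken from the patch or the regression
tests: the tables t1 and t2 below are hypothetical toy data (15 rows each,
unique values in column a, only 5 distinct values in column c), and
join_row_count is a made-up helper that simply counts the rows an inner
equijoin would produce.

```python
from collections import Counter


def join_row_count(left, right, keys):
    """Count rows produced by an inner equijoin of left and right on keys."""
    # Number of right-hand rows for each distinct join-key value.
    right_counts = Counter(tuple(row[k] for k in keys) for row in right)
    # Each left row yields one output row per matching right row.
    return sum(right_counts[tuple(row[k] for k in keys)] for row in left)


# Hypothetical toy data, NOT the regression-test tables:
# column a is unique, column c has only 5 distinct values.
t1 = [{"a": i, "c": i % 5} for i in range(15)]
t2 = [{"a": i, "c": i % 5} for i in range(15)]

# Joining on c alone: every row has several join partners.
rows_on_c = join_row_count(t1, t2, ["c"])

# Joining on both a and c: each row has exactly one join partner.
rows_on_a_and_c = join_row_count(t1, t2, ["a", "c"])

print(rows_on_c, rows_on_a_and_c)
```

With this toy data, joining on c alone gives each row three partners, so
the join output is larger than either input, while adding the clause on a
brings the output back down to one row per input row. That is the same
effect the extra join clauses are meant to have in the test case.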

I'm pretty happy with the resulting patch, and plan to commit it (only
to master) if nobody has any complaints. Please let me know if you
have complaints.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v4-0001-Don-t-reset-the-pathlist-of-partitioned-joinrels.patch application/octet-stream 24.7 KB
