Re: Support run-time partition pruning for hash join

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support run-time partition pruning for hash join
Date: 2023-08-22 09:51:38
Message-ID: CAMbWs4_1sqDXMV_qWup_GknNpL7qL9PKTKA+oe_FqhtNzSE=WA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 22, 2023 at 2:38 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> With Hash Join, it seems to me that the pruning must take place for
> every row that makes it into the hash table. There will be maybe
> cases where the unioned set of partitions simply yields every
> partition and all the work results in no savings. Pruning on a scalar
> value seems much more likely to be able to prune away unneeded
> Append/MergeAppend subnodes.

Yeah, you're right. If we have 'pt HashJoin t', for a subnode of 'pt'
to be pruned, it needs every row of 't' to be able to prune that
subnode. The situation may improve if we have more than 2-way hash
joins, because the final surviving subnodes would be the intersection of
matching subnodes in each Hash.

With parameterized nestloop I agree that it's more likely to be able to
prune subnodes at rescan of Append/MergeAppend nodes based on scalar
values.

Sometimes we may just not generate parameterized nestloop as final plan,
such as when there are no indexes and no lateral references in the
Append/MergeAppend node. In this case I think it would be great if we
can still do some partition prunning. So I think this new 'join
partition prunning mechanism' (maybe this is not a proper name) should
be treated as a supplement to, not a substitute for, the current
run-time partition prunning based on parameterized nestloop, and it is
so implemented in the patch.

> Perhaps there can be something adaptive in Hash Join which stops
> trying to prune when all partitions must be visited. On a quick
> glance of the patch, I don't see any code in ExecJoinPartitionPrune()
> which gives up trying to prune when the number of members in
> part_prune_result is equal to the prunable Append/MergeAppend
> subnodes.

Yeah, we can do that.

> But run-time pruning already works for Nested Loops... I must be
> missing something here.

Here I mean nestloop with non-parameterized inner path. As I explained
upthread, we need to have a Material node on the outer side for that to
work, which seems not possible in real world.

Thanks
Richard

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-08-22 10:12:40 Re: persist logical slots to disk during shutdown checkpoint
Previous Message Peter Eisentraut 2023-08-22 09:49:33 Re: add timing information to pg_upgrade