Re: A problem about partitionwise join

From: David Steele <david(at)pgmasters(dot)net>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Richard Guo <riguo(at)pivotal(dot)io>, ashutosh(dot)bapat(at)enterprisedb(dot)com
Subject: Re: A problem about partitionwise join
Date: 2021-03-09 16:22:32
Lists: pgsql-hackers

On 11/27/20 7:05 AM, Ashutosh Bapat wrote:
> On Tue, Nov 10, 2020 at 2:43 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
>> To recap, the problem we are fixing here is when generating join clauses
>> from equivalence classes, we only select the joinclause with the 'best
>> score', or the first joinclause with a score of 3. This may cause us to
>> miss some joinclause on partition keys and thus fail to generate
>> partitionwise join.
>> The initial idea for the fix is to create all the RestrictInfos from ECs
>> in order to check whether there exist equi-join conditions involving
>> pairs of matching partition keys of the relations being joined for all
>> partition keys. And then Tom proposed a much better idea which leverages
>> function exprs_known_equal() to tell whether the partkeys can be found
>> in the same eclass, which is the current implementation in the latest
>> patch.
> In the example you gave earlier, the equi join on partition key was
> there but it was replaced by individual constant assignment clauses.
> So if we keep the original restrictclause in there with a new flag
> indicating that it's redundant, have_partkey_equi_join will still be
> able to use it without much change. Depending on where we need to
> avoid restrictclauses with the redundant flag, this might be an
> easier approach. However, with Tom's idea partition-wise join may be
> used even when there is no equi-join between partition keys but there
> are clauses like pk = const for all tables involved and const is the
> same for all such tables.
> In the spirit of the small improvement made to the performance of
> have_partkey_equi_join(), pk_has_clause should be renamed as
> pk_known_equal and pks_known_equal as num_equal_pks.
> The loop traversing the partition keys at a given position may be
> optimized further if we pass lists to exprs_known_equal(), which in
> turn checks whether one expression from each list is a member of a
> given EC. This will avoid traversing all equivalence classes for each
> partition key expression, which can be a huge improvement when there
> are many ECs. But I think if one of the partition key expressions at a
> given position is a member of an equivalence class, all the other
> partition key expressions at that position should be part of that
> equivalence class, since there should be an equi-join between those. So
> the nested loop may not be required to start with.
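
For readers following the thread, here is a rough toy sketch of the
approach described above: decide whether partitionwise join is possible
by asking whether the partition key expressions at each position fall in
the same equivalence class. This is Python, not PostgreSQL code. The
function names are borrowed from the thread, but the signatures, the
string representation of expressions, and the set-based EC model are all
assumptions made purely for illustration.

```python
from typing import List, Set

# Toy model (an assumption, not PostgreSQL's EquivalenceClass struct):
# an equivalence class is a set of expression strings known to be equal.
EquivalenceClass = Set[str]


def exprs_known_equal(ecs: List[EquivalenceClass], a: str, b: str) -> bool:
    """Toy analogue: True if a and b are identical or share an EC."""
    if a == b:
        return True
    return any(a in ec and b in ec for ec in ecs)


def have_partkey_equi_join(ecs: List[EquivalenceClass],
                           partkeys1: List[List[str]],
                           partkeys2: List[List[str]]) -> bool:
    """Toy analogue: partkeysN[i] lists the expressions for key position i."""
    if len(partkeys1) != len(partkeys2):
        return False
    # One flag per partition key position, using the naming suggested above.
    pk_known_equal = [False] * len(partkeys1)
    for i, (exprs1, exprs2) in enumerate(zip(partkeys1, partkeys2)):
        # The nested loop discussed above: compare every pair of expressions
        # at this key position until one pair is known equal.
        pk_known_equal[i] = any(
            exprs_known_equal(ecs, e1, e2) for e1 in exprs1 for e2 in exprs2
        )
    num_equal_pks = sum(pk_known_equal)
    return num_equal_pks == len(partkeys1)


# "t1.a = 1 AND t2.a = 1": no explicit join clause on the partition keys,
# but both keys land in the same EC as the constant.
ecs_const = [{"t1.a", "t2.a", "1"}]
# Only a non-partition-key column is equated here.
ecs_other = [{"t1.b", "t2.b"}]

result_const = have_partkey_equi_join(ecs_const, [["t1.a"]], [["t2.a"]])
result_other = have_partkey_equi_join(ecs_other, [["t1.a"]], [["t2.a"]])
```

In this toy model, result_const is True even though there is no
t1.a = t2.a clause left, which is the "pk = const for all tables" case
mentioned above, while result_other is False because the partition keys
never share an EC.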

Richard, any thoughts on Ashutosh's comments?

