Re: A problem about partitionwise join

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Richard Guo <riguo(at)pivotal(dot)io>, ashutosh(dot)bapat(at)enterprisedb(dot)com
Subject: Re: A problem about partitionwise join
Date: 2020-11-27 12:05:00
Message-ID: CAExHW5vx89LbqphcusNR3AJ+kihWXDo0gv7eJVu6+nsafwow-Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 10, 2020 at 2:43 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
>
>
> On Fri, Nov 6, 2020 at 11:26 PM Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>>
>> Status update for a commitfest entry.
>>
>> According to CFbot this patch fails to apply. Richard, can you send an update, please?
>>
>> Also, I see that the thread was inactive for a while.
>> Are you going to continue this work? I think it would be helpful, if you could write a short recap about current state of the patch and list open questions for reviewers.
>>
>> The new status of this patch is: Waiting on Author
>
>
> Thanks Anastasia. I've rebased the patch with latest master.
>
> To recap, the problem we are fixing here is when generating join clauses
> from equivalence classes, we only select the joinclause with the 'best
> score', or the first joinclause with a score of 3. This may cause us to
> miss some joinclause on partition keys and thus fail to generate
> partitionwise join.
>
> The initial idea for the fix is to create all the RestrictInfos from ECs
> in order to check whether there exist equi-join conditions involving
> pairs of matching partition keys of the relations being joined for all
> partition keys. And then Tom proposed a much better idea which leverages
> function exprs_known_equal() to tell whether the partkeys can be found
> in the same eclass, which is the current implementation in the latest
> patch.
>

In the example you gave earlier, the equi-join on the partition key was
present, but it was replaced by individual constant-assignment clauses.
If we keep the original restrict clause around with a new flag marking
it as redundant, have_partkey_equi_join() will still be able to use it
without much change. Depending on how many places would then need to
skip restrict clauses carrying the redundant flag, this might be the
easier approach. However, with Tom's idea, partitionwise join may be
used even when there is no equi-join between the partition keys at all,
as long as there are clauses like pk = const for all the tables involved
and const is the same for all of them.
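
To make the difference between the two approaches concrete, here is a
toy Python model. It is only a sketch and not PostgreSQL code: an
equivalence class is modelled as a plain set of expression strings, the
function exprs_known_equal() below merely mimics the idea of the real
planner function of that name, and has_explicit_equi_join() along with
the table and column names are hypothetical.

```python
# Toy model only; none of this is PostgreSQL code.
# An equivalence class (EC) is modelled as a set of expression strings.

def exprs_known_equal(eclasses, a, b):
    """Return True if a and b are identical or belong to the same EC."""
    if a == b:
        return True
    return any(a in ec and b in ec for ec in eclasses)

def has_explicit_equi_join(join_clauses, a, b):
    """Return True only if a join clause 'a = b' was actually generated."""
    return (a, b) in join_clauses or (b, a) in join_clauses

# Assumed query shape: ... WHERE t1.pk = 42 AND t2.pk = 42
# Both keys and the constant end up in one EC, but no t1.pk = t2.pk
# join clause exists; only the per-table "pk = const" clauses do.
eclasses = [{"t1.pk", "t2.pk", "42"}]
join_clauses = []

found_by_clause = has_explicit_equi_join(join_clauses, "t1.pk", "t2.pk")
found_by_ec = exprs_known_equal(eclasses, "t1.pk", "t2.pk")

print("clause-based check:", found_by_clause)
print("EC-based check:", found_by_ec)
```

In this model the clause-based check finds nothing to work with, while
the EC-based check still sees that the two partition keys are known to
be equal.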

In the spirit of the small performance improvement made to
have_partkey_equi_join(), pk_has_clause should be renamed to
pk_known_equal, and pks_known_equal to num_equal_pks.

The loop that traverses the partition keys at a given position may be
optimized further if we pass lists to exprs_known_equal(), which in
turn checks whether one expression from each list is a member of a
given EC. This avoids traversing all equivalence classes for each
partition key expression, which can be a huge improvement when there
are many ECs. But I think that if one of the partition key expressions
at a given position is a member of an equivalence class, all the other
partition key expressions at that position should be part of that
equivalence class too, since there should be an equi-join between them.
So the nested loop may not be required to begin with.
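
A rough sketch of the list-based variant, again as a toy Python model
rather than PostgreSQL code. It assumes that each side of the join may
carry a list of candidate expressions for a given key position, and
that an EC can be modelled as a set of strings. The name
lists_known_equal() and all expression names are hypothetical.

```python
# Toy model only; none of this is PostgreSQL code.
# An equivalence class (EC) is modelled as a set of expression strings.

def lists_known_equal(eclasses, exprs1, exprs2):
    """Return True if some expression from exprs1 and some expression
    from exprs2 are identical or share an EC.

    The ECs are walked once per key position, instead of once per
    pair of expressions as a nested loop over both lists would do.
    """
    set1, set2 = set(exprs1), set(exprs2)
    if set1 & set2:
        return True  # an identical expression appears on both sides
    for ec in eclasses:
        # One membership test per list against this EC.
        if (ec & set1) and (ec & set2):
            return True
    return False

eclasses = [
    {"x.c", "y.c"},      # unrelated EC
    {"t1.a", "t2.a"},    # EC from the join condition t1.a = t2.a
]

# Candidate expressions for key position 0 on each side of the join.
pos0_left = ["t1.a", "coalesce(t1.a, t3.a)"]
pos0_right = ["t2.a"]

print(lists_known_equal(eclasses, pos0_left, pos0_right))
print(lists_known_equal(eclasses, ["t1.b"], ["t2.b"]))
```

If the observation above holds, namely that all key expressions at a
position end up in the same EC, then checking a single expression per
side would be enough and even the per-list membership test could go.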

--
Best Wishes,
Ashutosh Bapat
