Re: A problem about partitionwise join

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Richard Guo <riguo(at)pivotal(dot)io>, ashutosh(dot)bapat(at)enterprisedb(dot)com
Subject: Re: A problem about partitionwise join
Date: 2021-07-21 08:44:53
Message-ID: CAMbWs48C0ux9mc+AS7D7DP6CyJxjQS=2mxQTpMJJKEeTua5wbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 27, 2020 at 8:05 PM Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
wrote:

> On Tue, Nov 10, 2020 at 2:43 PM Richard Guo <guofenglinux(at)gmail(dot)com>
> wrote:
> > Thanks Anastasia. I've rebased the patch with latest master.
> >
> > To recap, the problem we are fixing here is when generating join clauses
> > from equivalence classes, we only select the joinclause with the 'best
> > score', or the first joinclause with a score of 3. This may cause us to
> > miss some joinclause on partition keys and thus fail to generate
> > partitionwise join.
> >
> > The initial idea for the fix is to create all the RestrictInfos from ECs
> > in order to check whether there exist equi-join conditions involving
> > pairs of matching partition keys of the relations being joined for all
> > partition keys. And then Tom proposed a much better idea which leverages
> > function exprs_known_equal() to tell whether the partkeys can be found
> > in the same eclass, which is the current implementation in the latest
> > patch.
> >
>
> In the example you gave earlier, the equi join on partition key was
> there but it was replaced by individual constant assignment clauses.
> So if we keep the original restrictclause in there with a new flag
> indicating that it's redundant, have_partkey_equi_join will still be
> able to use it without much change. Depending upon where all we need
> to use avoid restrictclauses with the redundant flag, this might be an
> easier approach. However, with Tom's idea partition-wise join may be
> used even when there is no equi-join between partition keys but there
> are clauses like pk = const for all tables involved and const is the
> same for all such tables.
>

Correct. So with Tom's idea partition-wise join can cope with clauses
such as 'foo.k1 = bar.k1 and foo.k2 = 16 and bar.k2 = 16'.

>
> In the spirit of small improvement made to the performance of
> have_partkey_equi_join(), pk_has_clause should be renamed as
> pk_known_equal and pks_known_equal as num_equal_pks.
>

Thanks for the suggestion. Will do that in the new version of patch.

>
> The loop traversing the partition keys at a given position, may be
> optimized further if we pass lists to exprs_known_equal() which in
> turns checks whether one expression from each list is member of a
> given EC. This will avoid traversing all equivalence classes for each
> partition key expression, which can be a huge improvement when there
> are many ECs. But I think if one of the partition key expression at a
> given position is member of an equivalence class all the other
> partition key expressions at that position should be part of that
> equivalence class since there should be an equi-join between those. So
> the loop in loop may not be required to start with.
>

Good point. Quote from one of Tom's earlier emails,
"It seems at least plausible that in the cases we care about, all the
partkeys on each side would be in the same eclasses anyway, so that
comparing the first members of each list would be sufficient."

But I'm not sure if this holds true in all cases. However, since each
base relation within the join contributes only one partexpr, the number
of partexprs would only be equal to the join degree. Thus the loop in
loop may not be a big problem?

PS. Sorry for delaying so long time!

Thanks
Richard

Attachment Content-Type Size
v5-0001-Fix-up-partitionwise-join.patch application/octet-stream 14.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ronan Dunklau 2021-07-21 09:05:14 Re: ORDER BY pushdowns seem broken in postgres_fdw
Previous Message Dean Rasheed 2021-07-21 08:23:18 Re: Have I found an interval arithmetic bug?