Re: A problem about partitionwise join

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Richard Guo <riguo(at)pivotal(dot)io>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A problem about partitionwise join
Date: 2019-08-29 18:08:27
Message-ID: CAPmGK15szCqpXT1RPxtHq=XTK019mUCPQ4RCFAMsRoU4vsyd1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 29, 2019 at 6:45 PM Richard Guo <riguo(at)pivotal(dot)io> wrote:
> On Wed, Aug 28, 2019 at 6:49 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
>> On Tue, Aug 27, 2019 at 4:57 PM Richard Guo <riguo(at)pivotal(dot)io> wrote:
>> > Check the query below as a more illustrative example:
>> >
>> > create table p (k int, val int) partition by range(k);
>> > create table p_1 partition of p for values from (1) to (10);
>> > create table p_2 partition of p for values from (10) to (100);
>> >
>> > If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate
>> > partitionwise join:
>> >
>> > # explain (costs off)
>> > select * from p as foo join p as bar on foo.k = bar.k and foo.k = bar.val;
>> > QUERY PLAN
>> > -----------------------------------------
>> > Append
>> > -> Hash Join
>> > Hash Cond: (foo.k = bar.k)
>> > -> Seq Scan on p_1 foo
>> > -> Hash
>> > -> Seq Scan on p_1 bar
>> > Filter: (k = val)
>> > -> Hash Join
>> > Hash Cond: (foo_1.k = bar_1.k)
>> > -> Seq Scan on p_2 foo_1
>> > -> Hash
>> > -> Seq Scan on p_2 bar_1
>> > Filter: (k = val)
>> > (13 rows)
>> >
>> > But if we exchange the order of the two quals to 'foo.k = bar.val and
>> > foo.k = bar.k', then partitionwise join cannot be generated any more,
>> > because we only have joinclause 'foo.k = bar.val' as it first reached
>> > score of 3. We have missed the joinclause on the partition key although
>> > it does exist.
>> >
>> > # explain (costs off)
>> > select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k;
>> > QUERY PLAN
>> > -----------------------------------------
>> > Hash Join
>> > Hash Cond: (foo.k = bar.val)
>> > -> Append
>> > -> Seq Scan on p_1 foo
>> > -> Seq Scan on p_2 foo_1
>> > -> Hash
>> > -> Append
>> > -> Seq Scan on p_1 bar
>> > Filter: (val = k)
>> > -> Seq Scan on p_2 bar_1
>> > Filter: (val = k)
>> > (11 rows)
>>
>> I think it would be nice if we can address this issue.

> Attached is a patch as an attempt to address this issue. The idea is
> quite straightforward. When building partition info for joinrel, we
> generate any possible EC-derived joinclauses of form 'outer_em =
> inner_em', which will be used together with the original restrictlist to
> check if there exists an equi-join condition for each pair of partition
> keys.

Thank you for the patch! Will review. Could you add the patch to the
upcoming CF so that it doesn’t get lost?

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-08-29 18:48:24 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Alan Kleiman 2019-08-29 15:03:22 Re: BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events