Re: d25ea01275 and partitionwise join

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Richard Guo <riguo(at)pivotal(dot)io>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: d25ea01275 and partitionwise join
Date: 2020-04-07 03:17:07
Message-ID: CA+HiwqEqwKXyjtTYKpU1ZTJe2N-aKpkK3WoAqLCnYkW8ySLOgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 7, 2020 at 2:41 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Langote <amitlangote09(at)gmail(dot)com> writes:
> > which does succeed in using partitionwise join. Please see attached
> > delta that applies on your v7 if that is what you'd rather have.
>
> I figured these queries were cheap enough that we could afford to run
> both. With that and some revision of the comments (per attached),
> I was feeling like we were ready to go.

Looks good to me.

> However, re-reading the thread,
> one of Richard's comments struck me as still relevant. If you try, say,
>
> create table p (k int, val int) partition by range(k);
> create table p_1 partition of p for values from (1) to (10);
> create table p_2 partition of p for values from (10) to (100);
>
> set enable_partitionwise_join = 1;
>
> explain
> select * from (p as t1 full join p as t2 on t1.k = t2.k) as t12(k1,val1,k2,val2)
> full join p as t3 on COALESCE(t12.k1, t12.k2) = t3.k;
>
> this patch will give you a partitioned join, with a different plan
> than you get without enable_partitionwise_join. This is scary,
> because it's not immediately obvious that the transformation is
> correct.
>
> I *think* that it might be all right, because although what we
> are matching to is a user-written COALESCE() not an actual
> FULL JOIN USING column, it has to behave in somewhat the same
> way. In particular, by construction it must be a coalesce of
> some representation of the matching partition columns of the
> full join's inputs. So, even though it might go to null in
> different cases than an actual USING variable would do, it
> does not break the ability to partition the join.

Seems fine to me too. Maybe users should avoid writing it by hand if
possible anyway, because even slight variation in the way it's written
will affect this:

set enable_partitionwise_join = 1;

-- order of coalesce() arguments reversed
explain (costs off)
select * from (p as t1 full join p as t2 on t1.k = t2.k) as t12(k1,val1,k2,val2)
full join p as t3 on COALESCE(t12.k2, t12.k1) = t3.k;
QUERY PLAN
----------------------------------------------
Hash Full Join
Hash Cond: (COALESCE(t2.k, t1.k) = t3.k)
-> Append
-> Hash Full Join
Hash Cond: (t1_1.k = t2_1.k)
-> Seq Scan on p_1 t1_1
-> Hash
-> Seq Scan on p_1 t2_1
-> Hash Full Join
Hash Cond: (t1_2.k = t2_2.k)
-> Seq Scan on p_2 t1_2
-> Hash
-> Seq Scan on p_2 t2_2
-> Hash
-> Append
-> Seq Scan on p_1 t3_1
-> Seq Scan on p_2 t3_2
(17 rows)

> However, I have not spent a whole lot of time thinking about
> partitionwise joins, so rather than go ahead and commit I am
> going to toss that point back out for community consideration.

Agreed.

> At the very least, what I'd written in the comment needs a
> lot more defense than it has now.

Sorry, which comment are you referring to?

--
Thank you,

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-04-07 03:22:20 Re: Don't try fetching future segment of a TLI.
Previous Message Fujii Masao 2020-04-07 03:15:00 Re: Don't try fetching future segment of a TLI.