From: | "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru> |
---|---|
To: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Kohei KaiGai <kaigai(at)heterodb(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Asymmetric partition-wise JOIN |
Date: | 2021-04-30 03:10:19 |
Message-ID: | de20c214-1b26-b94a-081b-5b9aad3f490c@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 11/30/20 7:43 PM, Anastasia Lubennikova wrote:
> This entry was inactive during this CF, so I've marked it as returned
> with feedback. Feel free to resubmit an updated version to a future
> commitfest.
I return the patch to commitfest. My current reason differs from reason
of origin author.
This patch can open a door for more complex optimizations in the
partitionwise join push-down technique.
I mean, we can push-down join not only of two partitioned tables with
the same partition schema, but a partitioned (sharded) table with an
arbitrary subplan that is provable independent of local resources.
Example:
CREATE TABLE p(a int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE p2 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE p3 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 2);
SELECT * FROM p, (SELECT * FROM generate_series(1,2) AS a) AS s
WHERE p.a=s.a;
Hash Join
Hash Cond: (p.a = a.a)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Function Scan on generate_series a
But with asymmetric join feature we have the plan:
Append
-> Hash Join
Hash Cond: (p_1.a = a.a)
-> Seq Scan on p1 p_1
-> Hash
-> Function Scan on generate_series a
-> Hash Join
Hash Cond: (p_2.a = a.a)
-> Seq Scan on p2 p_2
-> Hash
-> Function Scan on generate_series a
-> Hash Join
Hash Cond: (p_3.a = a.a)
-> Seq Scan on p3 p_3
-> Hash
-> Function Scan on generate_series a
In the case of FDW-sharding it means that if we can prove that the inner
relation is independent from the execution server, we can push-down
these joins and execute it in parallel.
--
regards,
Andrey Lepikhov
Postgres Professional
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2021-04-30 03:12:47 | Re: Replication slot stats misgivings |
Previous Message | David Rowley | 2021-04-30 02:50:33 | Re: Result Cache node shows per-worker info even for workers not launched |