Re: Asymmetric partition-wise JOIN

From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Aleksander Alekseev <afiskon(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, KaiGai Kohei <kaigai(at)heterodb(dot)com>
Subject: Re: Asymmetric partition-wise JOIN
Date: 2022-01-17 10:42:10
Message-ID: 88bc3c051d285653215393a56bdf3056@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andrey Lepikhov писал 2021-09-15 09:31:
> On 14/9/21 11:37, Andrey V. Lepikhov wrote:
>> Thank you for this good catch!
>> The problem was in the adjust_child_relids_multilevel routine. The
>> tmp_result variable sometimes points to original required_outer.
>> This patch adds new ways which optimizer can generate plans. One
>> possible way is optimizer reparameterizes an inner by a plain relation
>> from the outer (maybe as a result of join of the plain relation and
>> partitioned relation). In this case we have to compare tmp_result with
>> original pointer to realize, it was changed or not.
>> The patch in attachment fixes this problem. Additional regression test
>> added.
>>
> I thought more and realized there isn't necessary to recurse in the
> adjust_child_relids_multilevel() routine if required_outer contains
> only
> normal_relids.
> Also, regression tests were improved a bit.

Hi.
The patch does not longer apply cleanly, so I rebased it. Attaching
rebased version.
I've looked through it once again and have several questions.

1) In adjust_appendrel_attrs_multilevel(), can it happen that
child_relids is zero-length list (in this case pfree's will fail)? It
seems, no, but should we at least assert this? Note that in
adjust_appendrel_attrs() we add logic for nappinfos being 0.

2) In try_asymmetric_partitionwise_join() we state that 'Asymmetric join
isn't needed if the append node has only one child'. This is not
completely correct. Asymmetric join with one partition can be
advantageous when JOIN(A, UNION(B)) is more expensive than UNION(JOIN
(A, B)). The later is true, for example, when we join partitioned table
having foreign partitions with another foreign table and only one
partition is left.
Let's take the attached case (foreign_join.sql). When
list_length(append_path->subpaths) > 1 is present, we get the following
plan

set enable_partitionwise_join = on;

explain SELECT t1.a,t2.b FROM fprt1 t1 INNER JOIN ftprt2_p1 t2 ON (t1.a
= t2.b) WHERE t1.a < 250 AND t2.c like '%0004' ORDER BY 1,2;
QUERY PLAN
---------------------------------------------------------------------------------------
Sort (cost=208.65..208.69 rows=17 width=8)
Sort Key: t1.a
-> Hash Join (cost=202.60..208.30 rows=17 width=8)
Hash Cond: (t1.a = t2.b)
-> Foreign Scan on ftprt1_p1 t1 (cost=100.00..105.06 rows=125
width=4)
-> Hash (cost=102.39..102.39 rows=17 width=4)
-> Foreign Scan on ftprt2_p1 t2 (cost=100.00..102.39
rows=17 width=4)

In case when we change it to list_length(append_path->subpaths) > 0, we
get foreign join and cheaper plan:

explain verbose SELECT t1.a,t2.b FROM fprt1 t1 INNER JOIN ftprt2_p1 t2
ON (t1.a = t2.b) WHERE t1.a < 250 AND t2.c like '%0004' ORDER BY 1,2;

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=106.15..106.19 rows=17 width=8)
Output: t1.a, t2.b
Sort Key: t1.a
-> Foreign Scan (cost=102.26..105.80 rows=17 width=8)
Output: t1.a, t2.b
Relations: (public.ftprt1_p1 t1) INNER JOIN (public.ftprt2_p1
t2)
Remote SQL: SELECT r4.a, r2.b FROM (public.fprt1_p1 r4 INNER
JOIN public.fprt2_p1 r2 ON (((r4.a = r2.b)) AND ((r2.c ~~ '%0004')) AND
((r4.a < 250))))

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment Content-Type Size
v20-0001-Asymmetric-partitionwise-join.patch text/x-diff 41.6 KB
foreign_join.sql text/plain 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema 2022-01-17 11:17:16 Re: Per-table storage parameters for TableAM/IndexAM extensions
Previous Message Bharath Rupireddy 2022-01-17 10:40:13 Re: pg_replslotdata - a tool for displaying replication slot information