Re: [HACKERS] advanced partition matching algorithm for partition-wise join

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
Cc: amul sul <sulamul(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] advanced partition matching algorithm for partition-wise join
Date: 2020-01-28 04:39:03
Message-ID: 4F6838E1-0FE8-4B92-B126-809866A064DE@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On Tue, Dec 10, 2019 at 7:30 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
>> I don't see any issues in the latest version, but I think we
>> need to polish the patch, so I'll do that.
>
> I noticed some issues. :-( I think we should address it before
> polishing the patch. One thing I noticed is: the patch heavily
> modifies the existing test cases in partition_join.sql to test the new
> partition-matching algorithm, but I think we should leave those test
> cases alone because we would handle the exiting test cases (except one
> negative test case) as before (see the try_partitionwise_join()
> change in the patch), so those test cases would be still needed to
> test that. Attached is a proposed patch for that
> (v30-0001-Improve-partition-matching-for-partitionwise-join.patch)
> that 1) avoids modifying the existing test cases and 2) adds a
> slightly modified version of the test cases proposed in the previous
> patch to test the new algorithm. Though I omitted some test cases
> that seem redundant to me and added a bit more test cases involving
> NULL partitions and/or default partitions. The elapsed time to run
> the partition_join.sql regression test increased from 741 ms (HEAD) to
> 1086 ms in my environment, but I think that would be acceptable. I
> fixed one white space issue, but other than that, no code/comment
> changes.
>
> Another thing I noticed while working on the above is: the patch fails
> to apply PWJ to this case:
>
> CREATE TABLE plt1_ad (a int, b int, c text) PARTITION BY LIST (c);
> CREATE TABLE plt1_ad_p1 PARTITION OF plt1_ad FOR VALUES IN ('0001', '0003');
> CREATE TABLE plt1_ad_p2 PARTITION OF plt1_ad FOR VALUES IN ('0004', '0006');
> CREATE TABLE plt1_ad_p3 PARTITION OF plt1_ad FOR VALUES IN ('0008', '0009');
> CREATE TABLE plt1_ad_extra PARTITION OF plt1_ad FOR VALUES IN (NULL);
> INSERT INTO plt1_ad SELECT i, i, to_char(i % 10, 'FM0000') FROM
> generate_series(1, 299) i WHERE i % 10 NOT IN (0, 2, 5, 7);
> INSERT INTO plt1_ad VALUES (-1, -1, NULL);
> ANALYZE plt1_ad;
> CREATE TABLE plt2_ad (a int, b int, c text) PARTITION BY LIST (c);
> CREATE TABLE plt2_ad_p1 PARTITION OF plt2_ad FOR VALUES IN ('0002', '0003');
> CREATE TABLE plt2_ad_p2 PARTITION OF plt2_ad FOR VALUES IN ('0004', '0006');
> CREATE TABLE plt2_ad_p3 PARTITION OF plt2_ad FOR VALUES IN ('0007', '0009');
> CREATE TABLE plt2_ad_extra PARTITION OF plt2_ad FOR VALUES IN (NULL);
> INSERT INTO plt2_ad SELECT i, i, to_char(i % 10, 'FM0000') FROM
> generate_series(1, 299) i WHERE i % 10 NOT IN (0, 1, 5, 8);
> INSERT INTO plt2_ad VALUES (-1, -1, NULL);
> ANALYZE plt2_ad;
>
> EXPLAIN (COSTS OFF)
> SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_ad t1 LEFT JOIN plt2_ad t2 ON
> (t1.a = t2.a AND t1.c = t2.c) WHERE t1.b < 10 ORDER BY t1.a;
> QUERY PLAN
> --------------------------------------------------------
> Sort
> Sort Key: t1.a
> -> Hash Right Join
> Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
> -> Append
> -> Seq Scan on plt2_ad_p1 t2_1
> -> Seq Scan on plt2_ad_p2 t2_2
> -> Seq Scan on plt2_ad_p3 t2_3
> -> Seq Scan on plt2_ad_extra t2_4
> -> Hash
> -> Append
> -> Seq Scan on plt1_ad_p1 t1_1
> Filter: (b < 10)
> -> Seq Scan on plt1_ad_p2 t1_2
> Filter: (b < 10)
> -> Seq Scan on plt1_ad_p3 t1_3
> Filter: (b < 10)
> -> Seq Scan on plt1_ad_extra t1_4
> Filter: (b < 10)
> (19 rows)
>
> because merge_null_partitions() does not consider matching the NULL
> partitions from both sides, but matches the NULL partition on the
> plt1_ad side and a dummy partition, resulting in a non-PWJ plan (see
> [1]). I overlooked this case when modifying that function. :-(
> Another patch attached to fix this issue
> (v30-0002-Fix-handling-of-NULL-partitions.patch). (We would not need
> to fix this, if we could handle the case where a dummy partition is on
> the nullable side of an outer join [1], but we can't, so I think it
> would be a good idea at least for now to match the NULL partitions
> from both sides to do PWJ.)
>
> Best regards,
> Etsuro Fujita
>
> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7ad6498fd5a654de6e743814c36cf619a3b5ddb6
>
> <v30-0001-Improve-partition-matching-for-partitionwise-join.patch><v30-0002-Fix-handling-of-NULL-partitions.patch>

Fujita-san,

With respect to these two patches: They apply, compile, and pass all the regression tests. The code looks reasonable.

There is stray whitespace in v30-0002:

src/backend/partitioning/partbounds.c:4557: space before tab in indent.
+ outer_null_unmerged = true;

I have added tests checking correctness and showing some partition pruning limitations. Find three patches, attached.

The v31-0001-… patch merely applies your patches as a starting point for the next two patches. It is your work, not mine.

The v31-0002-… patch adds more regression tests for range partitioning. The commit message contains my comments about that.

The v31-0003-… patch adds more regression tests for list partitioning, and again, the commit message contains my comments about that.

I hope this review is helpful.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
v31-0001-Applying-Etsuro-Fujita-s-patches.patch application/octet-stream 182.3 KB
v31-0002-Extending-partition-join-tests.patch application/octet-stream 23.4 KB
v31-0003-Further-extending-partition-join-tests.patch application/octet-stream 32.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-01-28 05:01:55 Re: pause recovery if pitr target not reached
Previous Message Andrey Lepikhov 2020-01-28 04:10:07 Re: Removing unneeded self joins