Re: [HACKERS] advanced partition matching algorithm for partition-wise join

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: amul sul <sulamul(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] advanced partition matching algorithm for partition-wise join
Date: 2019-08-28 10:22:04
Message-ID: CAPmGK17C_mVRzEf8VOZfVa82XQJjeZkxF0JXJiUaUO9HHHLJDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 16, 2019 at 10:25 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
> It seems that I performed the above tests on an assertion-enabled
> build. :( So I executed the tests one more time. Here are the
> results.
>
> * 2-way self-join of pt: explain analyze select * from pt t0, pt t1
> where t0.a = t1.a;
> - HEAD:
> Planning Time: 0.969 ms
> Execution Time: 13.843 ms
> - with patch:
> Planning Time: 1.720 ms
> Execution Time: 14.393 ms
> - with patch plus attached:
> Planning Time: 1.630 ms
> Execution Time: 14.002 ms
>
> * 4-way self-join of pt: explain analyze select * from pt t0, pt t1,
> pt t2, pt t3 where t0.a = t1.a
>
> and t1.a = t2.a and t2.a = t3.a;
> - HEAD:
> Planning Time: 12.203 ms
> Execution Time: 31.784 ms
> - with patch:
> Planning Time: 32.102 ms
> Execution Time: 32.504 ms
> - with patch plus attached:
> Planning Time: 19.471 ms
> Execution Time: 32.582 ms
>
> * 8-way self-join of pt: explain analyze select * from pt t0, pt t1,
> pt t2, pt t3, pt t4, pt t5, pt t6, pt t7 where t0.a = t1.a and t1.a =
> t2.a and t2.a = t3.a and t3.a = t4.a and t4.a = t5.a and t5.a = t6.a
> and t6.a = t7.a;
> - HEAD:
> Planning Time: 948.131 ms
> Execution Time: 55.645 ms
> - with patch:
> Planning Time: 2939.813 ms
> Execution Time: 56.760 ms
> - with patch plus attached:
> Planning Time: 1108.076 ms
> Execution Time: 55.750 ms
>
> Note: the attached patch still uses the proposed partition matching
> algorithm for these queries. As I said before, these queries don't
> need that algorithm, so we could eliminate the planning overhead
> compared to HEAD, by planning these queries as before, perhaps, but I
> haven't modified the patch as such yet.

I modified the patch further as such. Attached is an updated version
of the patch created on top of the patch in [1]. I did the tests
again using the updated version of the patch. Here are the results:

* 2-way self-join of pt:
Planning Time: 1.043 ms
Execution Time: 13.931 ms

* 4-way self-join of pt:
Planning Time: 12.499 ms
Execution Time: 32.392 ms

* 8-way self-join of pt:
Planning Time: 968.412 ms
Execution Time: 56.328 ms

The planning time for each test case still increased slightly, but IMO
I think that would be acceptable. To see the efficiency of the
attached, I did another testing with test cases that really need the
new partition-matching algorithm:

* explain analyze select * from pt6 t6, pt7 t7 where t6.a = t7.a;
- base patch in [1]
Planning Time: 1.758 ms
Execution Time: 13.977 ms
- with attached
Planning Time: 1.777 ms
Execution Time: 13.959 ms

* explain analyze select * from pt4 t4, pt5 t5, pt6 t6, pt7 t7 where
t4.a = t5.a and t5.a = t6.a and t6.a = t7.a;
- base patch in [1]
Planning Time: 33.201 ms
Execution Time: 32.480 ms
- with attached
Planning Time: 21.019 ms
Execution Time: 32.777 ms

* explain analyze select * from pt0 t0, pt1 t1, pt2 t2, pt3 t3, pt4
t4, pt5 t5, pt6 t6, pt7 t7 where t0.a = t1.a and t1.a = t2.a and t2.a
= t3.a and t3.a = t4.a and t4.a = t5.a and t5.a = t6.a and t6.a =
t7.a;
- base patch in [1]
Planning Time: 3060.000 ms
Execution Time: 55.553 ms
- with attached
Planning Time: 1144.996 ms
Execution Time: 56.233 ms

where pt0, pt1, pt2, pt3, pt4, pt5, pt6 and pt7 are list partitioned
tables that have slighly different list values. (The structure and
list values of ptN are almost the same as that of pt used above, but
ptN's N-th partition ptNpN has an extra list value that pt's N-th
partition ptpN doesn't have.) If anyone is interested in this
testing, I'll send a script file for producing these list partitioned
tables.

About the attached:

* The attached patch modified try_partitionwise_join() so that we call
partition_bounds_equal() then partition_bounds_merge() (if necessary)
to create the partition bounds for the join rel. We don't support for
merging the partition bounds for the hash-partitioning case, so this
makes code to handle the hash-partitioning case in
partition_bounds_merge() completely unnecessary, so I removed that
entirely.

* I removed this assertion in partition_bounds_merge(), because I
think this is covered by two assertions above this.

+ Assert((*outer_parts == NIL || *inner_parts != NIL) &&
+ (*inner_parts == NIL || *outer_parts != NIL));

* (I forgot to mention this in a previous email, but) I removed this
bit of generate_matching_part_pairs(), because we already have the
same check in try_partitionwise_join(), so this bit would be redundant
IIUC.

+ switch (jointype)
+ {
+ case JOIN_INNER:
+ case JOIN_SEMI:
+
+ /*
+ * An inner or semi join can not return any row when the
+ * matching partition on either side is missing. We should
+ * have eliminated all such cases while merging the bounds.
+ */
+ Assert(part1 >= 0 && part2 >= 0);
+ break;
+
+ case JOIN_LEFT:
+ case JOIN_ANTI:
+ Assert(part1 >= 0);
+ if (part2 < 0)
+ merged = false;
+ break;
+
+ case JOIN_FULL:
+ if (part1 < 0 || part2 < 0)
+ merged = false;
+ break;
+
+ default:
+ elog(ERROR, "unrecognized join type: %d", (int) jointype);
+ }

* I added more comments.

If there are no objections, I'll merge the attached with the base patch in [1].

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/CAPmGK177W%2BjNgpM5f_m-vdDKbEBu_%3D3CyPzFjkT_1nzf1Vqn%2BA%40mail.gmail.com

Attachment Content-Type Size
Modify-partition-matching-algorithm-1.patch application/octet-stream 23.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2019-08-28 10:26:49 Re: [HACKERS] advanced partition matching algorithm for partition-wise join
Previous Message Sergei Kornilov 2019-08-28 09:49:46 Re: allow online change primary_conninfo