Re: [PoC] Reducing planning time when tables have many partitions

From: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2022-08-08 11:27:46
Message-ID: CAJ2pMkZKFVmPHovyyueBpwb_nYYVk2+GaDqgzxZVnjkvxgtXog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

On Thu, Jul 28, 2022 at 6:35 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I've attached a very half-done patch that might help you get started
> on this.

Thank you so much for creating the patch. I have implemented your
approach and attached a new version of the patch to this email.

If you have already applied David's patch, please start the 'git am'
command from 0002-Fix-bugs.patch. All regression tests passed with
this patch on my environment.

1. Optimizations

The new optimization techniques utilizing Bitmapsets are implemented
as the following functions in src/include/optimizer/paths.h.

* get_eclass_members_indexes_for_relids()
* get_eclass_members_indexes_for_not_children()
* get_eclass_members_indexes_for_relids_or_not_children()
* get_eclass_members_indexes_for_subsets_of_relids()
* get_eclass_members_indexes_for_subsets_of_relids_or_not_children()
// I think the names of these functions need to be reconsidered.

These functions intersect ec->ec_member_indexes and some Bitmapset and
return indexes of EquivalenceMembers that we want to get.

The implementation of the first three functions listed above is
simple. However, the rest functions regarding the bms_is_subset()
condition are a bit more complicated. I have optimized this case based
on Tom's idea. The detailed steps are as follows.

I. Intersect ec->ec_member_indexes and the Bitmapset in RelOptInfo.
This intersection set is a candidate for the EquivalenceMembers to be
retrieved.
II. Remove from the candidate set the members that do not satisfy the
bms_is_subset().

Optimization for EquivalenceMembers with a constant value is one of
the future works.

2. Experimental Results

I conducted an experiment by using the original query, which is
attached to this email. You can reproduce this experiment by the
following commands.

=====
psql -f create-tables.sql
psql -f query.sql
=====

The following table and the attached figure describe the experimental result.

Planning time of "query.sql" (n = the number of partitions)
----------------------------------------------------------------
n | Master (ms) | Patched (ms) | Speedup (%) | Speedup (ms)
----------------------------------------------------------------
1 | 0.809 | 0.760 | 6.09% | 0.049
2 | 0.799 | 0.811 | -1.53% | -0.012
4 | 1.022 | 0.989 | 3.20% | 0.033
8 | 1.357 | 1.325 | 2.32% | 0.032
16 | 2.149 | 2.026 | 5.69% | 0.122
32 | 4.357 | 3.925 | 9.91% | 0.432
64 | 9.543 | 7.543 | 20.96% | 2.000
128 | 27.195 | 15.823 | 41.82% | 11.372
256 | 130.207 | 52.664 | 59.55% | 77.542
384 | 330.642 | 112.324 | 66.03% | 218.318
512 | 632.009 | 197.957 | 68.68% | 434.052
640 | 1057.193 | 306.861 | 70.97% | 750.333
768 | 1709.914 | 463.628 | 72.89% | 1246.287
896 | 2531.685 | 738.827 | 70.82% | 1792.858
1024 | 3516.592 | 858.211 | 75.60% | 2658.381
----------------------------------------------------------------

-------------------------------------------------------
n | Stddev of Master (ms) | Stddev of Patched (ms)
-------------------------------------------------------
1 | 0.085 | 0.091
2 | 0.061 | 0.091
4 | 0.153 | 0.118
8 | 0.203 | 0.107
16 | 0.150 | 0.153
32 | 0.313 | 0.242
64 | 0.411 | 0.531
128 | 1.263 | 1.109
256 | 5.592 | 4.714
384 | 17.423 | 6.625
512 | 20.172 | 7.188
640 | 40.964 | 26.246
768 | 61.924 | 31.741
896 | 66.481 | 27.819
1024 | 80.950 | 49.162
-------------------------------------------------------

The speed up with the new patch was up to 75.6% and 2.7 seconds. The
patch achieved a 21.0% improvement even with 64 partitions, which is a
realistic size. We can conclude that this optimization is very
effective in workloads with highly partitioned tables.

Performance degradation occurred only when the number of partitions
was 2, and its degree was 1.53% or 12 microseconds. This degradation
is the difference between the average planning times of 10000 runs.
Their standard deviations far exceed the difference in averages. It is
unclear whether this degradation is an error.

=====

I'm looking forward to your comments.

--
Best regards,
Yuya Watari

Attachment Content-Type Size
0002-Fix-bugs.patch application/octet-stream 7.4 KB
0003-Implement-optimizations-based-on-Bitmapset.patch application/octet-stream 34.1 KB
0001-EClass-member-speedup.patch application/octet-stream 52.0 KB
figure.png image/png 233.8 KB
query.sql application/octet-stream 307 bytes
create-tables.sql application/octet-stream 1.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Marcos Pegoraro 2022-08-08 11:34:08 Re: bug on log generation ?
Previous Message Drouvot, Bertrand 2022-08-08 10:51:20 Re: Patch proposal: New hooks in the connection path