Re: [PoC] Reducing planning time when tables have many partitions

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2024-03-06 14:16:38
Message-ID: CAExHW5s=3PZjkskYcfO_72=B-Q5kmXZ9meP8DhXiHbBSC-fD3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Yuya

On Wed, Feb 28, 2024 at 4:48 PM Yuya Watari <watari(dot)yuya(at)gmail(dot)com> wrote:

> Hello,
>
> On Tue, Feb 13, 2024 at 6:19 AM Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
> wrote:
> >
> > Yes, it is working correctly now with the assertion check. I suppose
> > it's better to add this code with an additional comment and a
> > recommendation for other developers
> > to use it for checking in case of manipulations with the list of
> > equivalences.
>
> Thank you for your reply and advice. I have added this assertion so
> that other developers can use it in the future.
>
> I also merged recent changes and attached a new version, v24. Since
> this thread is getting long, I will summarize the patches.
>
>
>
I repeated my experiments in [1]. I ran 2, 3, 4, 5-way self-joins on a
partitioned table with 1000 partitions.

Planning time measurement
---------------------------------------
Without patch with an assert enabled build and enable_partitionwise_join =
false, those joins took 435.31 ms, 1629.16 ms, 4701.59 ms and 11976.69 ms
respectively.
Keeping other things the same, with the patch, they took 247.33 ms, 1318.57
ms, 6960.31 ms and 28463.24 ms respectively.
Those with enable_partitionwise_join = true are 488.73 ms, 2102.12 ms,
6906.02 ms and 21300.77 ms respectively without the patch.
And with the patch, 277.22 ms, 1542.48 ms, 7879.35 ms, and 31826.39 ms.

Without patch without assert enabled build and enable_partitionwise_join =
false, the joins take 298.43 ms, 1179.15 ms, 3518.84 ms and 9149.76 ms
respectively.
Keeping other things the same, with the patch, the joins take 65.70 ms,
131.29 ms, 247.67 ms and 477.74 ms respectively.
Those with enable_partitionwise_join = true are 348.48 ms, 1576.11 ms,
5417.98 and 17433.65 ms respectively without the patch.
And with the patch 95.15 ms, 333.99 ms, 1084.06 ms, and 3609.42 ms.

Memory usage measurement
---------------------------------------
Without patch, with an assert enabled build and enable_partitionwise_join =
false, memory used is 19 MB, 45 MB, 83 MB and 149 MB respectively.
Keeping other things the same, with the patch, memory used is 23 MB, 66 MB,
159 MB and 353 MB respectively.
That with enable_partitionwise_join = true is 40 MB, 151 MB, 464 MB and
1663 MB respectively.
And with the patch it is 44 MB, 172 MB, 540 MB and 1868 MB respectively.

Without patch without assert enabled build and enable_partitionwise_join =
false, memory used is 17 MB, 41 MB, 77 MB, and 140 MB resp.
Keeping other things the same with the patch memory used is 21 MB, 62 MB,
152 MB and 341 MB resp.
That with enable_partitionwise_join = true is 37 MB, 138 MB, 428 MB and
1495 MB resp.
And with the patch it is 42 MB, 160 MB, 496 MB and 1705 MB resp.

here's summary of observations
1. The patch improves planning time significantly (3X to 20X) and the
improvement increases with the number of tables being joined.
2. In the assert enabled build the patch slows down (in comparison to HEAD)
planning with higher number of tables in the join. You may want to
investigate this. But this is still better than my earlier measurements.
3. The patch increased memory consumption by planner. But the numbers have
improved since my last measurement. Still it will be good to investigate
what causes this extra memory consumption.
4. Generally with the assert enabled build planner consumes more memory
with or without patch. From my previous experience this might be due to
Bitmapset objects created within Assert() calls.

Does v24-0002 have any relation/overlap with my patches to reduce memory
consumed by RestrictInfos? Those patches have code to avoid creating
duplicate RestrictInfos (including commuted RestrictInfos) from ECs. [2]

[1]
https://www.postgresql.org/message-id/CAExHW5uVZ3E5RT9cXHaxQ_DEK7tasaMN=D6rPHcao5gcXanY5w@mail.gmail.com
[2]
https://www.postgresql.org/message-id/CAExHW5tEvzM%3D%2BLpN%3DyhU%2BP33D%2B%3D7x6fhzwDwNRM971UJunRTkQ%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-03-06 15:03:59 Re: Potential issue in ecpg-informix decimal converting functions
Previous Message Masahiko Sawada 2024-03-06 14:13:17 Re: [PoC] Improve dead tuple storage for lazy vacuum