Re: [PoC] Reducing planning time when tables have many partitions

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Yuya Watari <watari(dot)yuya(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Zhang Mingli <zmlpostgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Date: 2023-07-28 09:50:57
Message-ID: CAExHW5uVZ3E5RT9cXHaxQ_DEK7tasaMN=D6rPHcao5gcXanY5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Yuya, Andrey,

On Fri, Jul 28, 2023 at 9:58 AM Andrey Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:

> >>
> > Discovering quality of partition pruning at the stage of execution
> > initialization and using your set of patches I have found some dubious
> > results with performance degradation. Look into the test case in
> > attachment.
> > Here is three queries. Execution times:
> > 1 - 8s; 2 - 30s; 3 - 131s (with your patch set).
> > 1 - 5s; 2 - 10s; 3 - 33s (current master).
> >
> > Maybe it is a false alarm, but on my laptop I see this degradation at
> > every launch.
> Sorry for this. It was definitely a false alarm. In this patch,
> assertion checking adds much overhead. After switching it off, I found
> out that this feature solves my problem with a quick pass through the
> members of an equivalence class. Planning time results for the queries
> from the previous letter:
> 1 - 0.4s, 2 - 1.3s, 3 - 1.3s; (with the patches applied)
> 1 - 5s; 2 - 8.7s; 3 - 22s; (current master).

I measured planning time using my scripts setup.sql and queries.sql
attached to [1] with and without assert build using your patch. The
timings are recorded in the attached spreadsheet. I have following
observations

1. The patchset improves the planning time of queries involving
partitioned tables by an integral factor. Both in case of
partitionwise join and without it. The speedup is 5x to 21x in my
experiment. That's huge.
2. There's slight degradation in planning time of queries involving
unpartitioned tables. But I have seen that much variance usually.
3. assert and debug enabled build shows degradation in planning time
in all the cases.
4. There is substantial memory increase in all the cases. It's
percentage wise predominant when the partitionwise join is not used.

Given that most of the developers run assert enabled builds it would
be good to bring down the degradation there while keeping the
excellent speedup in non-assert builds.
Queries on partitioned tables eat a lot of memory anyways, increasing
that further should be avoided.

I have not studied the patches. But I think the memory increase has to
do with our Bitmapset structure. It's space inefficient when there are
thousands of partitions involved. See my comment at [2]

[1] https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph+Pvo5dNpdrVCsBgXEzDQ@mail.gmail.com
[2] https://www.postgresql.org/message-id/CAExHW5s4EqY43oB%3Dne6B2%3D-xLgrs9ZGeTr1NXwkGFt2j-OmaQQ%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
planning time measurement.ods application/vnd.oasis.opendocument.spreadsheet 11.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-07-28 10:06:33 Re: Support worker_spi to execute the function dynamically.
Previous Message Ashutosh Bapat 2023-07-28 09:46:57 Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)