Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)
Date: 2023-10-31 05:36:48
Message-ID: CAExHW5tREjmH9qMTVLR0fjiCojjwfgqsax4RUrxxmaQKWbLWZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 8, 2023 at 3:22 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Fri, Jul 28, 2023 at 3:16 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > Hi Tom, Richard,
> >
> > On Mon, Jul 24, 2023 at 8:17 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > >
> > > Thanks for pushing it!
> >
> > With this fix, I saw a noticeable increase in the memory consumption
> > of planner. I was running experiments mentioned in [1] The reason is
> > the Bitmapset created by bms_union() are not freed during planning and
> > when there are thousands of child joins involved, bitmapsets takes up
> > a large memory and there any a large number of bitmaps.
> >
> > Attached 0002 patch fixes the memory consumption by calculating
> > appinfos only once and using them twice. The number look like below
> >
> > Number of tables joined | without patch | with patch |
> > ------------------------------------------------------
> > 2 | 40.8 MiB | 40.3 MiB |
> > 3 | 151.6 MiB | 146.9 MiB |
> > 4 | 463.9 MiB | 445.5 MiB |
> > 5 | 1663.9 MiB | 1563.3 MiB |
> >
> > The memory consumption is prominent at higher number of joins as that
> > exponentially increases the number of child joins.
> >
> > Attached setup.sql and queries.sql and patch 0001 were used to measure
> > memory similar to [1].
> >
> > I don't think it's a problem with the patch itself. We should be able
> > to use Bitmapset APIs similar to what patch is doing. But there's a
> > problem with our Bitmapset implementation. It's not space efficient
> > for thousands of partitions. A quick calculation reveals this.
> >
> > If the number of partitions is 1000, the matching partitions will
> > usually be 1000, 2000, 3000 and so on. Thus the bitmapset represnting
> > the relids will be {b 1000, 2000, 3000, ...}. To represent a single
> > 1000th digit current Bitmapset implementation will allocate 1000/8 =
> > 125 bytes of memory. A 5 way join will require 125 * 5 = 625 bytes of
> > memory. This is even true for lower relid numbers since they will be
> > 1000 bits away e.g. (1, 1001, 2001, 3001, ...). So 1000 such child
> > joins require 625KiB memory. Doing this as many times as the number of
> > joins we get 120 * 625 KiB = 75 MiB which is closer to the memory
> > difference we see above.
> >
> > Even if we allocate a list to hold 5 integers it will not take 625
> > bytes. I think we need to improve our Bitmapset representation to be
> > efficient in such cases. Of course whatever representation we choose
> > has to be space efficient for a small number of tables as well and
> > should gel well with our planner logic. So I guess some kind of
> > dynamic representation which changes the underlying layout based on
> > the contents is required. I have looked up past hacker threads to see
> > if this has been discussed previously.
> >
> > I don't think this is the thread to discuss it and also I am not
> > planning to work on it in the immediate future. But I thought I would
> > mention it where I found it.
> >
> > [1] https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph+Pvo5dNpdrVCsBgXEzDQ@mail.gmail.com
> >
>
> Adding this small patch to the commitfest in case somebody finds it
> worth fixing this specific memory consumption. With a new subject.

Rebased patches.
0001 - is same as the squashed version of patches at
https://www.postgresql.org/message-id/CAExHW5sCJX7696sF-OnugAiaXS=Ag95=-m1cSrjcmyYj8Pduuw@mail.gmail.com.
0002 is the actual fix described earlier

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
0002-Reuse-child_relids-bitmapset-in-partitionwi-20231031.patch text/x-patch 4.7 KB
0001-Report-memory-used-for-planning-a-query-in--20231031.patch text/x-patch 14.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-10-31 05:51:12 Re: Synchronizing slots from primary to standby
Previous Message Andrei Lepikhov 2023-10-31 05:36:42 Re: A performance issue with Memoize