From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2024-03-21 09:02:47
Message-ID: CAD21AoCMJy-tdD_HgB+E3A2azhDM0jGsahbEVnsS=f-RC1Gw8Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 21, 2024 at 4:35 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> On Thu, Mar 21, 2024 at 1:11 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> > Or we can have a new function for dsa.c to set the initial and max
> > segment size (or either one) to the existing DSA area so that
> > TidStoreCreate() can specify them at creation.
>
> I didn't like this very much, because it's splitting an operation
> across an API boundary. The caller already has all the information it
> needs when it creates the DSA. Straw man proposal: it could do the
> same for local memory, then they'd be more similar. But if we made
> local contexts the responsibility of the caller, that would cause
> duplication between creating and resetting.

Fair point.

>
> > In shared TidStore
> > cases, since all memory required by shared radix tree is allocated in
> > the passed-in DSA area and the memory usage is the total segment size
> > allocated in the DSA area
>
> ...plus apparently some overhead, I just found out today, but that's
> beside the point.
>
> On Thu, Mar 21, 2024 at 2:02 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > Yet another idea is that TidStore creates its own DSA area in
> > TidStoreCreate(). That is, in TidStoreCreate() we create a DSA area
> > (using dsa_create()) and pass it to RT_CREATE(). Also, we need a new
> > API to get the DSA area. The caller (e.g. parallel vacuum) gets the
> > dsa_handle of the DSA and stores it in the shared memory (e.g. in
> > PVShared). TidStoreAttach() will take two arguments: dsa_handle for
> > the DSA area and dsa_pointer for the shared radix tree. This idea
> > still requires controlling min/max segment sizes since dsa_create()
> > uses 1MB as the initial segment size. But TidStoreCreate() would be
> > more user friendly.
>
> This seems like an overall simplification, aside from future size
> configuration, so +1 to continue looking into this. If we go this
> route, I'd like to avoid a boolean parameter and cleanly separate
> TidStoreCreateLocal() and TidStoreCreateShared(). Every operation
> after that can introspect, but it's a bit awkward to force these cases
> into the same function. It always was a little bit, but this change
> makes it more so.

I've looked into this idea further. Overall, it looks clean and I
don't see any problem so far in terms of integration with lazy vacuum.
I've attached three patches for discussion and tests.

- 0001 patch makes lazy vacuum use tidstore.
- 0002 patch makes DSA init/max segment size configurable (borrowed
from another thread).
- 0003 patch makes TidStore create its own DSA area with init/max DSA
segment adjustment (PoC patch).
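
To illustrate why the init/max segment sizes matter for a memory-bounded
store, here is a rough standalone model. It is not dsa.c's actual
allocation policy: the SegmentPolicy struct, the total_allocated()
function, and the plain "double each segment up to a cap" rule are all
assumptions made for the example. It only shows how the cap limits how
far the total can overshoot the amount actually needed.

```c
#include <assert.h>
#include <stddef.h>

#define MB ((size_t) 1024 * 1024)

/*
 * Hypothetical model of geometric segment growth. This is NOT the real
 * dsa.c policy; the names and the doubling rule are assumptions.
 */
typedef struct SegmentPolicy
{
    size_t init_size;   /* size of the first segment, in bytes */
    size_t max_size;    /* cap on any single segment, in bytes */
} SegmentPolicy;

/*
 * Add segments (each twice the previous one, capped at max_size) until
 * the total reaches at least "needed" bytes. Returns the total size of
 * all segments created.
 */
size_t
total_allocated(SegmentPolicy p, size_t needed)
{
    size_t total = 0;
    size_t seg = p.init_size;

    assert(p.init_size > 0 && p.init_size <= p.max_size);

    while (total < needed)
    {
        total += seg;

        /* Grow the next segment, but never beyond the cap. */
        if (seg < p.max_size)
        {
            seg *= 2;
            if (seg > p.max_size)
                seg = p.max_size;
        }
    }
    return total;
}
```

In this model, a 64MB demand with a 1MB initial segment and an
effectively unlimited cap ends up with 127MB of segments, while capping
segments at 8MB ends up with 71MB.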

One thing unclear to me is whether this idea will be usable even when
we want to use the tidstore for parallel bitmap scan. Currently, we
create a shared tidbitmap on a DSA area in ParallelExecutorInfo. This
DSA area is used not only for tidbitmap but also for parallel hash
etc. If the tidstore created its own DSA area, parallel bitmap scan
would have to use the tidstore's DSA in addition to the DSA area in
ParallelExecutorInfo. I'm not sure if there are some differences
between these usages in terms of resource manager etc. It seems to be
no problem, but I might be missing something.
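
To make the handoff concrete, here is a standalone mock of the API
shape under discussion. Nothing here is taken from the attached patches
or from PostgreSQL headers: every Mock* type and function is a
hypothetical stand-in, and the "DSA area" is just a heap struct holding
a handle. The point is only what the leader has to publish (a handle
plus a pointer) and that a parallel bitmap scan would carry the
tidstore's handle next to the executor's existing area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Mock stand-ins for dsa_handle, dsa_pointer and dsa_area. */
typedef uint32_t mock_dsa_handle;
typedef uint64_t mock_dsa_pointer;

typedef struct MockDsaArea
{
    mock_dsa_handle handle;
} MockDsaArea;

typedef struct MockTidStore
{
    bool             is_shared;
    MockDsaArea     *area;   /* area owned by the store; NULL if local */
    mock_dsa_pointer tree;   /* location of the shared tree; 0 if local */
} MockTidStore;

/* Hypothetical shared state for a parallel bitmap scan. */
typedef struct MockParallelState
{
    MockDsaArea     *executor_area;    /* area shared with parallel hash etc. */
    mock_dsa_handle  tidstore_handle;  /* extra: the tidstore's own area */
    mock_dsa_pointer tidstore_tree;
} MockParallelState;

/* Local variant: no DSA area is involved at all. */
MockTidStore *
MockTidStoreCreateLocal(void)
{
    MockTidStore *ts = calloc(1, sizeof(MockTidStore));

    assert(ts != NULL);
    return ts;
}

/* Shared variant: the store creates and owns its area. */
MockTidStore *
MockTidStoreCreateShared(mock_dsa_handle new_handle)
{
    MockTidStore *ts = calloc(1, sizeof(MockTidStore));

    assert(ts != NULL);
    ts->area = calloc(1, sizeof(MockDsaArea));
    assert(ts->area != NULL);
    ts->area->handle = new_handle;
    ts->is_shared = true;
    ts->tree = 1;            /* placeholder non-zero pointer */
    return ts;
}

/* Accessors the leader would use to publish to shared memory. */
mock_dsa_handle
MockTidStoreGetDsaHandle(const MockTidStore *ts)
{
    assert(ts->is_shared);
    return ts->area->handle;
}

mock_dsa_pointer
MockTidStoreGetPointer(const MockTidStore *ts)
{
    assert(ts->is_shared);
    return ts->tree;
}

/* Worker side: attaching needs both the handle and the pointer. */
MockTidStore *
MockTidStoreAttach(mock_dsa_handle handle, mock_dsa_pointer tree)
{
    MockTidStore *ts = MockTidStoreCreateShared(handle);

    ts->tree = tree;
    return ts;
}

void
MockTidStoreDestroy(MockTidStore *ts)
{
    free(ts->area);          /* free(NULL) is a no-op for local stores */
    free(ts);
}
```

The leader would fill a MockParallelState from the two accessors, and
each worker would call MockTidStoreAttach() with those two values while
still using executor_area for everything else.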

Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment | Content-Type | Size
v77-0003-PoC-Make-shared-TidStore-create-its-own-DSA-area.patch | application/octet-stream | 14.3 KB
v77-0002-Make-DSA-initial-and-maximum-segment-size-config.patch | application/octet-stream | 9.6 KB
v77-0001-Use-TidStore-for-dead-tuple-TIDs-storage-during-.patch | application/octet-stream | 43.9 KB
