Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2024-03-21 07:02:05
Message-ID: CAD21AoBY8nxRoYx8JStNfe-sui=rS67M6JBBbad5NwaO1bgLuQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 21, 2024 at 3:10 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Mar 21, 2024 at 12:40 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> >
> > On Thu, Mar 21, 2024 at 9:37 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Mar 20, 2024 at 11:19 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> > > > Are they (the blocks to be precise) really out of order? The VALUES
> > > > statement is ordered, but after inserting it does not output that way.
> > > > I wondered if this is platform independent, but CI and our dev
> > > > machines haven't failed this test, and I haven't looked into what
> > > > determines the order. It's easy enough to hide the blocks if we ever
> > > > need to, as we do elsewhere...
> > >
> > > It seems not necessary as such a test is already covered by
> > > test_radixtree. I've changed the query to hide the output blocks.
> >
> > Okay.
> >
> > > The buildfarm has been all-green so far.
> >
> > Great!
> >
> > > I've attached the latest vacuum improvement patch.
> > >
> > > I just remembered that the tidstore still cannot be used for parallel
> > > vacuum with minimum maintenance_work_mem. Even when the shared
> > > tidstore is empty, its memory usage reports 1056768 bytes, a bit above
> > > 1MB (1048576 bytes). We need something discussed on another thread[1]
> > > in order to make it work.
> >
> > For exactly this reason, we used to have a clamp on max_bytes when it
> > was internal to tidstore, so that it never reported full when first
> > created, so I guess that got thrown away when we got rid of the
> > control object in shared memory. Forcing callers to clamp their own
> > limits seems pretty unfriendly, though.
>
> Or we can have a new function for dsa.c to set the initial and max
> segment size (or either one) to the existing DSA area so that
> TidStoreCreate() can specify them at creation. In shared TidStore
> cases, since all memory required by shared radix tree is allocated in
> the passed-in DSA area and the memory usage is the total segment size
> allocated in the DSA area, the user will have to prepare a DSA area
> only for the shared tidstore. So we might be able to expect that the
> DSA passed-in to TidStoreCreate() is empty and its segment sizes can
> be adjustable.

Yet another idea is for TidStore to create its own DSA area in
TidStoreCreate(). That is, in TidStoreCreate() we create a DSA area
(using dsa_create()) and pass it to RT_CREATE(). We would also need a
new API to get the DSA area. The caller (e.g. parallel vacuum) gets the
dsa_handle of the DSA area and stores it in shared memory (e.g. in
PVShared). TidStoreAttach() will take two arguments: the dsa_handle for
the DSA area and the dsa_pointer for the shared radix tree. This idea
still requires controlling the min/max segment sizes, since
dsa_create() uses 1MB as the initial segment size, but
TidStoreCreate() would be more user-friendly.
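
To make the handle flow above easier to discuss, here is a toy,
self-contained C model. It is not the PoC patch and it does not use
PostgreSQL's real dsa.c or tidstore.c code: every toy_* name, the
in-process registry standing in for shared memory, and the
init_segment_size parameter are assumptions made for illustration only.
It only shows (1) the store creating and owning its area, (2) a worker
attaching with the two values the leader published, and (3) why the
initial segment size matters when memory usage is reported as the total
segment size. The 1056768 and 1048576 figures are the ones quoted
earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins only; these are NOT PostgreSQL's definitions. */
typedef uint32_t toy_dsa_handle;
typedef uint64_t toy_dsa_pointer;

typedef struct toy_dsa_area
{
    toy_dsa_handle handle;      /* what a leader would publish */
    size_t      total_segment_size; /* what memory usage reports */
} toy_dsa_area;

typedef struct ToyTidStore
{
    toy_dsa_area *area;         /* area owned by the store */
    toy_dsa_pointer tree;       /* "pointer" to the shared radix tree */
    size_t      max_bytes;      /* caller's memory limit */
} ToyTidStore;

/* In-process registry that plays the role of shared memory. */
#define TOY_MAX_AREAS 8
static toy_dsa_area *toy_registry[TOY_MAX_AREAS];

/* Create an area whose first segment has the given (hypothetical) size. */
toy_dsa_area *
toy_dsa_create(size_t init_segment_size)
{
    for (toy_dsa_handle h = 0; h < TOY_MAX_AREAS; h++)
    {
        if (toy_registry[h] == NULL)
        {
            toy_dsa_area *area = malloc(sizeof(*area));

            assert(area != NULL);
            area->handle = h;
            area->total_segment_size = init_segment_size;
            toy_registry[h] = area;
            return area;
        }
    }
    return NULL;                /* registry is full */
}

/* Look up an existing area by handle, as a worker would. */
toy_dsa_area *
toy_dsa_attach(toy_dsa_handle handle)
{
    return handle < TOY_MAX_AREAS ? toy_registry[handle] : NULL;
}

/* The store creates its own area instead of taking one from the caller. */
ToyTidStore *
toy_tidstore_create(size_t max_bytes, size_t init_segment_size)
{
    ToyTidStore *ts = malloc(sizeof(*ts));

    assert(ts != NULL);
    ts->area = toy_dsa_create(init_segment_size);
    assert(ts->area != NULL);
    ts->tree = 0x10;            /* dummy value standing in for the tree */
    ts->max_bytes = max_bytes;
    return ts;
}

/* The two values the leader would store in shared memory. */
toy_dsa_handle
toy_tidstore_get_dsa_handle(const ToyTidStore *ts)
{
    return ts->area->handle;
}

toy_dsa_pointer
toy_tidstore_get_handle(const ToyTidStore *ts)
{
    return ts->tree;
}

/* A worker attaches using both the area handle and the tree pointer. */
ToyTidStore *
toy_tidstore_attach(toy_dsa_handle area_handle, toy_dsa_pointer tree)
{
    ToyTidStore *ts = malloc(sizeof(*ts));

    assert(ts != NULL);
    ts->area = toy_dsa_attach(area_handle);
    assert(ts->area != NULL);
    ts->tree = tree;
    ts->max_bytes = 0;          /* the limit is not tracked by workers here */
    return ts;
}

/* Usage is the total segment size, so an empty store can look full. */
int
toy_tidstore_is_full(const ToyTidStore *ts)
{
    return ts->area->total_segment_size > ts->max_bytes;
}
```

In this model, toy_tidstore_create(1048576, 1056768) is already "full"
when empty, while the same limit with a smaller first segment is not,
which is the part that still needs the segment size control.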

I've attached a PoC patch for discussion.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
tidstore_creates_dsa.patch.nocfbot application/octet-stream 10.0 KB
