Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2024-01-29 07:29:10
Message-ID: CAD21AoDUHfgruJNJr-y3oOLbUG+Fhruf0mChdKBy=i0XG_XPyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 26, 2024 at 11:05 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Jan 24, 2024 at 3:42 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> >
> > On Tue, Jan 23, 2024 at 10:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > The new patches probably need to be polished but the VacDeadItemInfo
> > > idea looks good to me.
> >
> > That idea looks good to me, too. Since you already likely know what
> > you'd like to polish, I don't have much to say except for a few
> > questions below. I also did a quick sweep through every patch, so some
> > of these comments are unrelated to recent changes:
>
> Thank you!
>
> >
> > +/*
> > + * Calculate the slab blocksize so that we can allocate at least 32 chunks
> > + * from the block.
> > + */
> > +#define RT_SLAB_BLOCK_SIZE(size) \
> > + Max((SLAB_DEFAULT_BLOCK_SIZE / (size)) * (size), (size) * 32)
> >
> > The first parameter seems to be trying to make the block size exact,
> > but that's not right, because of the chunk header, and maybe
> > alignment. If the default block size is big enough to waste only a
> > tiny amount of space, let's just use that as-is.
>
> Agreed.
>

As of v55 patch, the sizes of each node class are:

- node4: 40 bytes
- node16_lo: 168 bytes
- node16_hi: 296 bytes
- node48: 784 bytes
- node256: 2088 bytes

If we use SLAB_DEFAULT_BLOCK_SIZE (8kB) for each node class, we waste
(approximately):

- node4: 32 bytes
- node16_lo: 128 bytes
- node16_hi: 200 bytes
- node48: 352 bytes
- node256: 1928 bytes

We might want to calculate a better slab block size for node256 at least.

> >
> > + * TODO: The caller must be certain that no other backend will attempt to
> > + * access the TidStore before calling this function. Other backend must
> > + * explicitly call TidStoreDetach() to free up backend-local memory associated
> > + * with the TidStore. The backend that calls TidStoreDestroy() must not call
> > + * TidStoreDetach().
> >
> > Do we need to do anything now?
>
> No, will remove it.
>

I misunderstood something. I think the above statement is still true
but we don't need to do anything at this stage. It's a typical usage
that the leader destroys the shared data after confirming all workers
are detached. It's not a TODO but probably a NOTE.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2024-01-29 07:32:06 Re: Network failure may prevent promotion
Previous Message Bharath Rupireddy 2024-01-29 07:27:00 Re: New Table Access Methods for Multi and Single Inserts