Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2024-03-03 05:43:05
Message-ID: CANWCAZYJKFFQcv3bioNufUq03EDggBTDSonvmqAuMi8ADJ=1jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 1, 2024 at 3:01 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Feb 29, 2024 at 8:43 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:

> > + ts->rt_context = AllocSetContextCreate(CurrentMemoryContext,
> > + "tidstore storage",
> >
> > "tidstore storage" sounds a bit strange -- maybe look at some other
> > context names for ideas.
>
> Agreed. How about "tidstore's radix tree"?

That might be okay. I'm now thinking "TID storage". On that note, one
improvement needed when we polish tidstore.c is to make sure it's
spelled "TID" in comments, like other files do already.

> > - leaf.alloc = (RT_PTR_ALLOC) MemoryContextAlloc(tree->leaf_ctx, allocsize);
> > + leaf.alloc = (RT_PTR_ALLOC) MemoryContextAlloc(tree->leaf_ctx != NULL
> > + ? tree->leaf_ctx
> > + : tree->context, allocsize);
> >
> > Instead of branching here, can we copy "context" to "leaf_ctx" when
> > necessary (those names should look more like eachother, btw)? I think
> > that means anything not covered by this case:
> >
> > +#ifndef RT_VARLEN_VALUE_SIZE
> > + if (sizeof(RT_VALUE_TYPE) > sizeof(RT_PTR_ALLOC))
> > + tree->leaf_ctx = SlabContextCreate(ctx,
> > + RT_STR(RT_PREFIX) "radix_tree leaf contex",
> > + RT_SLAB_BLOCK_SIZE(sizeof(RT_VALUE_TYPE)),
> > + sizeof(RT_VALUE_TYPE));
> > +#endif /* !RT_VARLEN_VALUE_SIZE */
> >
> > ...also, we should document why we're using slab here. On that, I
> > don't recall why we are? We've never had a fixed-length type test case
> > on 64-bit, so it wasn't because it won through benchmarking. It seems
> > a hold-over from the days of "multi-value leaves". Is it to avoid the
> > possibility of space wastage with non-power-of-two size types?
>
> Yes, it matches my understanding.

There are two issues quoted here, so not sure if you mean both or only
the last one...

For the latter, I'm not sure it makes sense to have code and #ifdef's
to force slab for large-enough fixed-length values just because we
can. There may never be such a use-case anyway. I'm also not against
it, either, but it seems like a premature optimization.

> > For this stanza that remains unchanged:
> >
> > for (int i = 0; i < RT_SIZE_CLASS_COUNT; i++)
> > {
> > MemoryContextDelete(tree->node_slabs[i]);
> > }
> >
> > if (tree->leaf_ctx)
> > {
> > MemoryContextDelete(tree->leaf_ctx);
> > }
> >
> > ...is there a reason we can't just delete tree->ctx, and let that
> > recursively delete child contexts?
>
> I thought that considering the RT_CREATE doesn't create its own memory
> context but just uses the passed context, it might be a bit unusable
> to delete the passed context in the radix tree code. For example, if a
> caller creates a radix tree (or tidstore) on a memory context and
> wants to recreate it again and again, he also needs to re-create the
> memory context together. It might be okay if we leave comments on
> RT_CREATE as a side effect, though. This is the same reason why we
> don't destroy tree->dsa in RT_FREE(). And, as for RT_FREE_RECURSE(),

Right, I should have said "reset". Resetting a context will delete
it's children as well, and seems like it should work to reset the tree
context, and we don't have to know whether that context actually
contains leaves at all. That should allow copying "tree context" to
"leaf context" in the case where we have no special context for
leaves.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2024-03-03 06:10:57 Re: Shared detoast Datum proposal
Previous Message Dian Fay 2024-03-03 04:13:12 [docs] revise ORDER BY documentation