Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2022-07-06 13:43:09
Message-ID: CAD21AoC49cDu0u+ybbS_e8zvvRaOttpL2dxyx5NjKXqTB5HuQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 5, 2022 at 5:09 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2022-07-05 16:33:17 +0900, Masahiko Sawada wrote:
> > On Tue, Jul 5, 2022 at 6:18 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > A datum value is convenient to represent both a pointer and a value so
> > I used it to avoid defining node types for inner and leaf nodes
> > separately.
>
> I'm not convinced that's a good goal. I think we're going to want to have
> different key and value types, and trying to unify leaf and inner nodes is
> going to make that impossible.
>
> Consider e.g. using it for something like a buffer mapping table - your key
> might be way too wide to fit it sensibly into 64bit.

Right. It seems to be better to have an interface so that the user of
the radix tree can specify the arbitrary key size (and perhaps value
size too?) on creation. And we can have separate leaf node types that
have values instead of pointers. If the value size is less than
pointer size, we can have values within leaf nodes but if it’s bigger
probably the leaf node can have pointers to memory where to store the
value.

>
>
> > Since a datum could be 4 bytes or 8 bytes depending it might not be good for
> > some platforms.
>
> Right - thats another good reason why it's problematic. A lot of key types
> aren't going to be 4/8 bytes dependent on 32/64bit, but either / or.
>
>
> > > > +void
> > > > +radix_tree_insert(radix_tree *tree, uint64 key, Datum val, bool *found_p)
> > > > +{
> > > > + int shift;
> > > > + bool replaced;
> > > > + radix_tree_node *node;
> > > > + radix_tree_node *parent = tree->root;
> > > > +
> > > > + /* Empty tree, create the root */
> > > > + if (!tree->root)
> > > > + radix_tree_new_root(tree, key, val);
> > > > +
> > > > + /* Extend the tree if necessary */
> > > > + if (key > tree->max_val)
> > > > + radix_tree_extend(tree, key);
> > >
> > > FWIW, the reason I used separate functions for these in the prototype is that
> > > it turns out to generate a lot better code, because it allows non-inlined
> > > function calls to be sibling calls - thereby avoiding the need for a dedicated
> > > stack frame. That's not possible once you need a palloc or such, so splitting
> > > off those call paths into dedicated functions is useful.
> >
> > Thank you for the info. How much does using sibling call optimization
> > help the performance in this case? I think that these two cases are
> > used only a limited number of times: inserting the first key and
> > extending the tree.
>
> It's not that it helps in the cases moved into separate functions - it's that
> not having that code in the "normal" paths keeps the normal path faster.

Thanks, understood.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-07-06 14:01:00 Re: AIX support - alignment issues
Previous Message Drouvot, Bertrand 2022-07-06 13:30:37 Re: Minimal logical decoding on standbys