Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2022-10-06 09:30:52
Message-ID: CAFBsxsG+Kh3HnLa7VYcc-C5gf3z1qOjn+m3Nrhj-bT0ovtoUMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 6, 2022 at 2:53 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:
>
> On Wed, Oct 5, 2022 at 6:40 PM John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
wrote:
> >
> > This wasn't the focus of your current email, but while experimenting
with v6 I had another thought about local allocation: If we use the default
slab block size of 8192 bytes, then only 3 chunks of size 2088 can fit,
right? If so, since aset and DSA also waste at least a few hundred bytes,
we could store a useless 256-byte slot array within node256. That way,
node128 and node256 share the same start of pointers/values array, so there
would be one less branch for getting that address. In v6,
rt_node_get_values and rt_node_get_children are not inlined (asde: gcc uses
a jump table for 5 kinds but not for 4), but possibly should be, and the
smaller the better.
>
> It would be good for performance but I'm a bit concerned that it's
> highly optimized to the design of aset and DSA. Since size 2088 will
> be currently classed as 2616 in DSA, DSA wastes 528 bytes. However, if
> we introduce a new class of 2304 (=2048 + 256) bytes we cannot store a
> useless 256-byte and the assumption will be broken.

A new DSA class is hypothetical. A better argument against my idea is that
SLAB_DEFAULT_BLOCK_SIZE is arbitrary. FWIW, I looked at the prototype just
now and the slab block sizes are:

Max(pg_nextpower2_32((MAXALIGN(inner_class_info[i].size) + 16) * 32), 1024)

...which would be 128kB for nodemax. I'm curious about the difference.

> > One concern is, handling both local and dsa cases in the same code
requires more (predictable) branches and reduces code density. That might
be a reason in favor of templating to handle each case in its own
translation unit.
>
> Right. We also need to support locking for shared radix tree, which
> would require more branches.

Hmm, now it seems we'll likely want to template local vs. shared as a later
step...

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-10-06 09:34:26 Re: meson PGXS compatibility
Previous Message Drouvot, Bertrand 2022-10-06 08:43:57 Re: Query Jumbling for CALL and SET utility statements