Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2023-01-17 11:05:59
Message-ID: CAFBsxsHkjxoJjQhU_Oseon+XPofoi4OskCTW9p2ybXzBYubBrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 16, 2023 at 3:18 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:
>
> On Mon, Jan 16, 2023 at 2:02 PM John Naylor
> <john(dot)naylor(at)enterprisedb(dot)com> wrote:

Attached is an update that mostly has the modest goal of getting CI green
again. v19-0003 has squashed the entire radix tree template from
previously. I've kept out the perf test module for now -- still needs
updating.

> > [05:44:11.819] test_radixtree.c.obj : error LNK2001: unresolved
> > external symbol pg_popcount64
> > [05:44:11.819] src\test\modules\test_radixtree\test_radixtree.dll :
> > fatal error LNK1120: 1 unresolved externals
>
> Yeah, I'm not sure what's causing that. Since that comes from a debugging
function, we could work around it, but it would be nice to understand why,
so I'll probably have to experiment on my CI repo.

I'm still confused by this error, because it only occurs in the test
module. I successfully built with just 0002 in CI so elsewhere where bmw_*
symbols resolve just fine on all platforms. I've worked around the error in
v19-0004 by using the general-purpose pg_popcount() function. We only need
to count bits in assert builds, so it doesn't matter a whole lot.

> + /* XXX: do we need to set a callback on exit to detach dsa? */
>
> In the current shared radix tree design, it's a caller responsible
> that they create (or attach to) a DSA area and pass it to RT_CREATE()
> or RT_ATTACH(). It enables us to use one DSA not only for the radix
> tree but also other data. Which is more flexible. So the caller needs
> to detach from the DSA somehow, so I think we don't need to set a
> callback here for that.
>
> ---
> + dsa_free(tree->dsa, tree->ctl->handle); // XXX
> + //dsa_detach(tree->dsa);
>
> Similar to above, I think we should not detach from the DSA area here.
>
> Given that the DSA area used by the radix tree could be used also by
> other data, I think that in RT_FREE() we need to free each radix tree
> node allocated in DSA. In lazy vacuum, we check the memory usage
> instead of the number of TIDs and need to reset the TidScan after an
> index scan. So it does RT_FREE() and dsa_trim() to return DSM segments
> to the OS. I've implemented rt_free_recurse() for this purpose in the
> v15 version patch.
>
> --
> - Assert(tree->root);
> + //Assert(tree->ctl->root);
>
> I think we don't need this assertion in the first place. We check it
> at the beginning of the function.

I've removed these in v19-0006.

> > That sounds like a good idea. It's also worth wondering if we even need
RT_NUM_ENTRIES at all, since the caller is capable of keeping track of that
if necessary. It's also misnamed, since it's concerned with the number of
keys. The vacuum case cares about the number of TIDs, and not number of
(encoded) keys. Even if we ever (say) changed the key to blocknumber and
value to Bitmapset, the number of keys might not be interesting.
>
> Right. In fact, TIdStore doesn't use RT_NUM_ENTRIES.

I've moved it to the test module, which uses it extensively. There, the
name is more clear what it's for, so I didn't change the name.

> > It sounds like we should at least make the delete functionality
optional. (Side note on optional functions: if an implementation didn't
care about iteration or its order, we could optimize insertion into linear
nodes)
>
> Agreed.

Done in v19-0007.

v19-0009 is just a rebase over some more vacuum cleanups.

I'll continue working on internals cleanup.

--
John Naylor
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v19-0001-introduce-vector8_min-and-vector8_highbit_mask.patch text/x-patch 2.6 KB
v19-0005-Remove-RT_NUM_ENTRIES.patch text/x-patch 2.7 KB
v19-0004-Workaround-link-errors-on-Windows-CI.patch text/x-patch 1.1 KB
v19-0002-Move-some-bitmap-logic-out-of-bitmapset.c.patch text/x-patch 6.1 KB
v19-0003-Add-radixtree-template.patch text/x-patch 107.9 KB
v19-0009-Use-TIDStore-for-storing-dead-tuple-TID-during-l.patch text/x-patch 34.6 KB
v19-0006-Shared-memory-cleanups.patch text/x-patch 1.3 KB
v19-0007-Make-RT_DELETE-optional.patch text/x-patch 3.7 KB
v19-0008-Add-TIDStore-to-store-sets-of-TIDs-ItemPointerDa.patch text/x-patch 23.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message wangw.fnst@fujitsu.com 2023-01-17 11:15:38 RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Previous Message Takamichi Osumi (Fujitsu) 2023-01-17 11:00:11 RE: Time delayed LR (WAS Re: logical replication restrictions)