Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date: 2023-12-12 02:53:08
Message-ID: CANWCAZY46dP5=igG1f3REPq32BqQsKCV19YS-UZy3D309F=qkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 11, 2023 at 1:18 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:

> I've attached the updated patch set. From the previous patch set, I've
> merged patches 0007 to 0010. The other changes such as adding RT_GET()
> still are unmerged for now, for discussion. Probably we can make them
> as follow-up patches as we discussed. 0011 to 0015 patches are new
> changes for v44 patch set, which removes RT_SEARCH() and RT_SET() and
> support variable-length values.

This looks like the right direction, and I'm pleased it's not much
additional code on top of my last patch.

v44-0014:

+#ifdef RT_VARLEN_VALUE
+ /* XXX: need to choose block sizes? */
+ tree->leaf_ctx = AllocSetContextCreate(ctx,
+ "radix tree leaves",
+ ALLOCSET_DEFAULT_SIZES);
+#else
+ tree->leaf_ctx = SlabContextCreate(ctx,
+ "radix tree leaves",
+ RT_SLAB_BLOCK_SIZE(sizeof(RT_VALUE_TYPE)),
+ sizeof(RT_VALUE_TYPE));
+#endif /* RT_VARLEN_VALUE */

Choosing block size: Similar to what we've discussed previously around
DSA segments, we might model this on CreateWorkExprContext() in
src/backend/executor/execUtils.c. Maybe tid store can pass maint_w_m /
autovac_w_m (later work_mem for bitmap scan). RT_CREATE could set the
max block size to 1/16 of that, or less.

Also, it occurred to me that compile-time embeddable values don't need
a leaf context. I'm not sure how many places assume that there is
always a leaf context. If not many, it may be worth not creating one
here, just to be tidy.

+ size_t copysize;

- memcpy(leaf.local, value_p, sizeof(RT_VALUE_TYPE));
+ copysize = sizeof(RT_VALUE_TYPE);
+#endif
+
+ memcpy(leaf.local, value_p, copysize);

I'm not sure this indirection adds clarity. I guess the intent was to
keep from saying "memcpy" twice, but now the code has to say "copysize
= foo" twice.

For varlen case, we need to watch out for slowness because of memcpy.
Let's put that off for later testing, though. We may someday want to
avoid a memcpy call for the varlen case, so let's keep it flexible
here.

v44-0015:

+#define SizeOfBlocktableEntry (offsetof(

Unused.

+ char buf[MaxBlocktableEntrySize] = {0};

Zeroing this buffer is probably going to be expensive. Also see this
pre-existing comment:
/* WIP: slow, since it writes to memory for every bit */
page->words[wordnum] |= ((bitmapword) 1 << bitnum);

For this function (which will be vacuum-only, so we can assume
ordering), in the loop we can:
* declare the local bitmapword variable to be zero
* set the bits on it
* write it out to the right location when done.

Let's fix both of these at once.

+ if (TidStoreIsShared(ts))
+ shared_rt_set(ts->tree.shared, blkno, (void *) page, page_len);
+ else
+ local_rt_set(ts->tree.local, blkno, (void *) page, page_len);

Is there a reason for "void *"? The declared parameter is
"RT_VALUE_TYPE *value_p" in 0014.
Also, since this function is for vacuum (and other uses will need a
new function), let's assert the returned bool is false.

Does iteration still work? If so, it's not too early to re-wire this
up with vacuum and see how it behaves.

Lastly, my compiler has a warning that CI doesn't have:

In file included from ../src/test/modules/test_radixtree/test_radixtree.c:121:
../src/include/lib/radixtree.h: In function ‘rt_find.isra’:
../src/include/lib/radixtree.h:2142:24: warning: ‘slot’ may be used
uninitialized [-Wmaybe-uninitialized]
2142 | return (RT_VALUE_TYPE*) slot;
| ^~~~~~~~~~~~~~~~~~~~~
../src/include/lib/radixtree.h:2112:23: note: ‘slot’ was declared here
2112 | RT_PTR_ALLOC *slot;
| ^~~~

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2023-12-12 03:31:03 Improve eviction algorithm in ReorderBuffer
Previous Message Hayato Kuroda (Fujitsu) 2023-12-12 02:43:26 RE: [Proposal] Add foreign-server health checks infrastructure