Re: Better shared data structure management and resizable shared data structures

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, chaturvedipalak1911(at)gmail(dot)com
Subject: Re: Better shared data structure management and resizable shared data structures
Date: 2026-04-04 13:51:09
Message-ID: CAEze2Whd_cB4oZwZq-Ar1KgUGiA_gZPPWOvUcw_3LC7sJJB67w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 4 Apr 2026 at 02:49, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Those are now committed, and here's a new version rebased over those
> changes. The hash options is now called 'nelems', and the 'extra_size'
> in ShmemStructOpts is gone.
>
> Plus a bunch of other fixes and cleanups. I also reordered and
> re-grouped the patches a little, into more logical increments I hope.

0001: LGTM

0002:

> +++ b/src/backend/storage/ipc/shmem.c

> + * Nowadays, there is also a third way to allocate shared memory called

There's no clear indicator of the second way to allocate shared
memory, nor is the first one clearly defined in the new verson of the
comment block.

> + * item is deleted. However, if one hash table grows very large and then
> + * shrinks, its space cannot be redistributed to other tables. We could build
> + * a simple hash bucket garbage collector if need be. Right now, it seems
> + * unnecessary.

I think this new text is outdated, given that we don't have growing
hash tables anymore.

I also think it should've referred to elements, not buckets;
dynahash's buckets cannot readily be deallocated as they're generally
always "in use" (they might be NULL, but they're still accessed in
read operations on missing keys). Elements are put in the freelist if
not used, and those could be released into a memory pool if so desired
(and coded).

> + * In builtin PostgreSQL code, add the callbacks to the list in
> + * src/include/storage/subsystemlist.h.

This refers to an automation system that's introduced a few commits
later, in commit 0005, and therefore probably should be added only in
that commit.

> + * Legacy ShmemInitStruct()/ShmemInitHash() functions
> + * --------------------------------------------------

Should we have checks in place to avoid calls to new APIs from old
callbacks, and vice versa?

> ShmemRequestInternal(...
> + ShmemRequest *request;
[...]
> + foreach_ptr(ShmemStructDesc, existing, pending_shmem_requests)
[...]
> + request = palloc(sizeof(ShmemRequest));
[...]
> + pending_shmem_requests = lappend(pending_shmem_requests, request);

It looks like you missed my earlier comment about type confusion.
Here, pending_shmem_requests is a List of ShmemRequest pointers, while
the foreach_ptr() uses ShmemStructDesc, which is a type confusion. The
loop checks the 'char *name' field of ShmemStructDesc, which in a
ShmemRequest is the 'ShmemStructDesc *desc'. This bug would cause
issues if different ShmemStructDescs are registered by the same name,
as the ShmemStructDescs wouldn't (necessarily) be strcmp()-equal for
the same name.

> ShmemAttachRequested(void)
> + /* Call attach callbacks */
> + foreach(lc, registered_shmem_callbacks)
> + {
> + const ShmemCallbacks *callbacks = (const ShmemCallbacks *) lfirst(lc);

This would be more concise with foreach_ptr(const ShmemCallbacks,
callbacks, registered_shmem_callbacks), like in ShmemInitRequested.

> +++ b/src/include/storage/shmem.h
> +/*
> + * Shared memory is reserved and allocated in stages at postmaster startup,
> + * and in EXEC_BACKEND mode, there's some extra work done to "attach" to them
> + * at backend startup. ShmemCallbacks holds callback functions that are
> + * called at different stages.
> + */
> +typedef struct ShmemCallbacks

Maybe this should also have the opportunity for a (before_)shmem_exit callback?

> + * on-demaind in a backend. If a subsystem sets this flag, the callbacks are
> + * called immediately after registration, to initialize or attach to the
> + * requested shared memory areas.

Ideally we only immediately call the callbacks if we're under
postmaster, or in a standalone backend; we shouldn't allocate shmem
for some preloaded libraries that set this flag, at least not ahead of
loading all preload libraries.

0003: Maybe this could also test that the protections we're putting in
place against double-registration of shmem areas actually detect the
duplication issue?
Otherwise, LGTM

0004-0014: TBD

While it's mostly mechanical changes, it did make me notice the rather
annoying allocation patterns by XLOGShmemRequest. It allocates various
types of data in one go (which, in principle, is fine) but in doing so
it adds its own alignment tricks etc, and I'm not super stoked about
that. If time allows, could we clean that up?

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2026-04-04 13:52:34 Re: vectorized CRC on ARM64
Previous Message Sami Imseih 2026-04-04 13:34:38 Re: remove autoanalyze corner case