Re: Better shared data structure management and resizable shared data structures

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, chaturvedipalak1911(at)gmail(dot)com
Subject: Re: Better shared data structure management and resizable shared data structures
Date: 2026-04-04 23:17:29
Message-ID: CAEze2WjgCROMMXY0+j8FFdm3iFcr7By-+6Mwiz=PgGSEydiW3A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 4 Apr 2026 at 19:32, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> On 04/04/2026 15:00, Matthias van de Meent wrote:
> > On Sat, 4 Apr 2026 at 02:45, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> >>>> I don't understand the use of ShmemStructDesc. They generally/always
> >>>> are private to request_fn(), and their fields are used exclusively
> >>>> inside the shmem mechanisms, with no reads of its fields that can't
> >>>> already be deduced from context. Why do we need that struct
> >>>> everywhere?
> >>>
> >>> My resizable shared memory structure patches use it as a handle to the
> >>> structure to be resized.
> >>
> >> Right. And hash tables and SLRUs use a desc-like object already, so for
> >> symmetry it feels natural to have it for plain structs too.
> >> I wonder if we should make it optional though, for the common case that
> >> you have no intention of doing anything more with the shmem region that
> >> you'd need a desc for. I'm thinking you could just pass NULL for the
> >> desc pointer:
> >>
> >> ShmemRequestStruct(NULL,
> >> .name = "pg_stat_statements",
> >> .size = sizeof(pgssSharedState),
> >> .ptr = (void **) &pgss,
> >> };
> >
> > That would help, though I'd still wonder why we'd have separate Opts
> > and Desc structs. IIUC, they generally carry (exactly) the same data.
> >
> > Maybe moving it into a `.handle` or `.desc` field in Shmem*Opts could
> > make that part of the code a bit cleaner; as it'd further clarify that
> > it's very much an optional field.
>
> Yeah. OTOH, I'd like to separate the options from what's effectively a
> return value. But maybe you're right and it's nevertheless better that way.
>
> Some options on this:
>
> a) What's in the patch now
[...]
> b) Allow passing NULL for the desc
[...]
> c) Return the Desc as a return value
[...]
> In option c) you can just throw away the result if you don't need it. I
> kind of like this as a notational thing. However it has some downsides:
>
> This changes the return value to be a pointer. I'm thinking that
> ShmemRequestStruct() palloc's the descriptor struct in TopMemoryContext.
> This is a little ugly because the descriptor struct is leaked if the
> caller throws it away. It's not a lot of memory, but still.

Yeah, it'd be bad if we'd leak it, as it could cause some
semipermanent memory leaks when the server keeps restarting after
crash without resetting TopMemoryContext.

> d) Make it part of Opts, as you suggested
[...]
> In the attached new version, though, I stepped back and decided to
> remove the whole ShmemStructDesc after all. I still think having a
> handle like that is a good idea, and the follow-up patches for resizing
> need it. However, with option d) it can easily be added later. With
> option d), it seems silly to have it be part of the patch now, when the
> desc struct doesn't really do anything.

Thanks!

> Other changes in this patch version:
>
> - I moved some of the stuff from shmem.h to a new shmem_internal.h
> header. The idea is that what remains in shmem.h provides the public API
> for allocating shared memory.
>
> - I refactored the "after-startup request" code. It now detects the case
> that some of the shmem areas, but not all, have already been initialized
> and throws an error.
>
> Still processing the rest of the feedback from the past days. This patch
> version is also available at
> https://github.com/hlinnaka/postgres/tree/shmem-init-refactor-11.

Thanks!

> On Sat, 4 Apr 2026 at 02:49, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > Those are now committed, and here's a new version rebased over those
> > changes. The hash options is now called 'nelems', and the 'extra_size'
> > in ShmemStructOpts is gone.
> >
> > Plus a bunch of other fixes and cleanups. I also reordered and
> > re-grouped the patches a little, into more logical increments I hope.
>
> 0004-0014: TBD

Review continued, based on v11.

0004: LGTM, with some nits:

> + * This is called at postmaster startup. Note that the shared memory isn't
> + * allocated here yet, this merely register our needs.

Typo: register -> registers

Formatting:
> + ShmemRequestHash(.name = "pg_stat_statements hash",
> + .nelems = pgss_max,
> + .hash_info.keysize = sizeof(pgssHashKey),
> + .hash_info.entrysize = sizeof(pgssEntry),
> + .hash_flags = HASH_ELEM | HASH_BLOBS,
> + .ptr = &pgss_hash,
> + );
(note that additional unit of indentation for the closing bracket)

Is this malformatting caused by pgindent? If so, could you see if
there's a better way of defining ShmemRequestHash/Struct that doesn't
have this indent as output?

> + pgss->extent = 0;
> + pgss->n_writers = 0;
> + pgss->gc_count = 0;
> + pgss->stats.dealloc = 0;

Shmem is said to be zero-initialized, should we remove the manual
zero-initialization?

> + on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
See my upthread comment about adding optional on_shmem_exit callbacks
to ShmemCallbacks.

0005: LGTM

0006: I don't think it is a great idea to make the LwLock machinery
the first to get allocation requests:
It has the RequestNamedLWLockTranche infrastructure, which can only
register new requests while process_shmem_requests_in_progress, and
making it request its memory ahead of everything else is likely to
cause an undersized tranche to be allocated. You could make sure that
this isn't an issue by maintaining a flag in lwlock.c that's set when
the shmem request is made (and reset on shmem exit), which must be
false when RequestNamedLWLockTranche() is called, and if not then it
should throw an error.

0007: LGTM, Nits:

Patch description: ProgGlobal -> ProcGlobal

> %_sema.c
Not my favourite pieces of code, but that's not your patch's fault. To
me, it seems this code area has too much duplication, but that's not
something you have to fix.

0008: LGTM

> +#ifdef USE_ASSERT_CHECKING
> + SerialPagePrecedesLogicallyUnitTests();
> +#endif

Huh, interesting. I hadn't seen such inline unit testing before.
Mailing list history seems to agree with this, so, TIL.

0009: LGTM

0010: Not looked at everything yet, but a few comment:

> +++ b/src/include/access/slru.h

With the changes in the signatures for most/all SLRU functions from a
hidden-by-typedef pointer to a visible pointer type, maybe this could
be an opportunity to swap them to `const SlruDesc *ctl` wherever
possible? I don't think there are many backend-local changes that
happen to SlruDescs once we've properly started the backend. I'm happy
to provide an incremental patch if you'd like me to spend cycles on it
if you're busy.

> +++ b/src/backend/access/transam/clog.c

> + SimpleLruRequest(.desc = &XactSlruDesc,
> + .name = "transaction",
> + .Dir = "pg_xact",
> + .long_segment_names = false,
> +
> + .nslots = CLOGShmemBuffers(),
> + .nlsns = CLOG_LSNS_PER_PAGE,
> +
> + .sync_handler = SYNC_HANDLER_CLOG,
> + .PagePrecedes = CLOGPagePrecedes,
> + .errdetail_for_io_error = clog_errdetail_for_io_error,

That awfully inconsistent field name styling is ... awful, but not
this patch's fault. If something can be done about it in a cheap
fashion in this patch, that'd be great, but I won't hold it against
you if that's skipped.

> +++ b/src/backend/access/transam/multixact.c

> static void
> MultiXactShmemRequest(void *arg)
> [...]
> + /*
> + * members SLRU doesn't call SimpleLruTruncate() or meet criteria for unit
> + * tests
> + */
I think this comment is misplaced, it should probably be put in
MultiXactShmemInit(), below MultiXactOffset's UnitTests (which is just
a few lines below its current location).

The rest of 0010; all of 0011-0014: TBD

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2026-04-04 23:27:34 Re: PG 19 release notes and authors
Previous Message Robert Haas 2026-04-04 22:52:21 Re: pg_plan_advice