| From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
| Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, chaturvedipalak1911(at)gmail(dot)com |
| Subject: | Re: Better shared data structure management and resizable shared data structures |
| Date: | 2026-04-05 11:20:28 |
| Message-ID: | CAExHW5u=zdSntmd9hhrMFDQP5n15Rz+JMRYHyo1FNwXE5LsuDg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Sun, Apr 5, 2026 at 2:36 PM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
>
> On Sun, 5 Apr 2026, 07:59 Ashutosh Bapat, <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
>
> I'm not opposed to HAVE_RESIZABLE_SHMEM, but is it universal enough on
> its platforms to make it part of the exposed ABI for Shmem? I think
> that we should expose the same functions and structs, and just have
> the shmem internals throw an error if the configuration used by the
> user implies the user wants to update shmem sizing when the system
> doesn't support it. That would avoid extensions having to recompile
> between have/have not systems that have an otherwise compatible ABI;
> especially when those extensions don't actually need the resizeable
> part of the shmem system.
>
I don't think I understand this fully. An extension may want to
support a structure in both modes - fixed as well as resizable
depending upon whether the latter is supported. If the structure has
maximum_size always the extension code needs to set it to 0 when the
resizable shared structure is not supported and set to actual
maximum_size when the resizable structure is supported. Without a
macro or some flag they can not do that. The flag/macro then becomes
part ABI for shmem. Am I correct? Since extension binaries need to be
built on different platforms anyway, that would automatically take
care of building with or without HAVE_RESIZABLE_SHMEM. I feel it makes
testing simpler since run time behaviour is fixed. Maybe I am missing
something. Maybe a code diff or some example platform might make it
more clear for me.
> > Following points are up for discussion
> > =============================
>
> Nit: I think "reserved"/"space_reserved" is a better descriptor than
> "allocated_space", as "allocated_space" could reasonably imply the
> memory isn't available to the OS.
Renamed it to reserved_space. New name is also less confusing with
allocation_size.
>
> Note that currently, your patch rejects the case where resizeable
> structs are initialized at their maximum size:
>
> > +++ b/src/backend/storage/ipc/shmem.c
>
> > +#ifdef HAVE_RESIZABLE_SHMEM
> > + if (options->maximum_size > 0 && options->size >= options->maximum_size)
> > + elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
> > + options->name, options->maximum_size, options->size);
>
> It'd need to check 'options->size > options->maximum_size' to allow
> max-sized initialization to succeed here without erroring.
good catch. FIxed in the attached patch.
>
> > But if we do so, we need another member in
> > ShmemStructDesc and ShmemIndexEnt to indicate whether the structure is
> > resizable or not. Instead the patches set maximum_size to 0 for
> > fixed-size structures and non-zero for resizable structures. This way
> > we can check whether a structure is resizable or not by checking
> > whether its maximum_size is zero or not. pg_shmem_allocations view
> > also has a maximum_size column which has the similar characteristics.
> > I would like to know what others think.
>
> I think that shmem allocations can set
>
> .size for the initial size, and
> .minimum_size/.maximum_size for configuring resizeability;
>
> The latter fields can then be initialized with .size if they're 0.
>
>
> > 3. allocated_space member in various structures and in pg_shmem_allocations view
> > -------------------------------------------------------------------------------
> > The patch adds a new member allocated_space to ShmemIndexEnt and
> > pg_shmem_allocations view. allocated_space to maximum_size is what
> > allocated_size is to size - it's the type aligned value of
> > maximum_size. But it also highlights the difference between the
> > address space allocation and the actual memory allocation. This
> > difference is crucial to resizable structures. However, unlike
> > maximum_size, we set it to a non-zero value, allocated_size, for
> > fixed-size structures as well since they are allocated the same amount
> > of space as their allocated_size. While this seems logically correct
> > to me, some may find maximum_size to be zero but allocated_space to be
> > non-zero for fixed-size structures a bit weird. I would like to know
> > what others think.
>
> I'd prefer to have consistent values; constant-sized structs are no
> different from resizable structs whose min/max size equal their
> current size. The only alternative that I think could be considered
> correct is returning NULL for those, but zero is definitely wrong.
>
> Note that returning min/max=size would also allow for better
> aggregations on pg_shmem_allocations columns.
>
> Note: if we expose minimum_size, we may also want to expose
> min_allocated_size (i.e., the full reservation minus the size of
> MADV_REMOVEd pages when the shmem allocation is min-sized).
>
> > As a side question, do we want to allow users to specify minimum_size
> > in ShmemStructOpts for resizable structures? Resizing memory lower
> > than that would be prohibited. For fixed sized structures,
> > minimum_size would be same as size and also maximum_size.
>
> I think it would be useful, if only to inform users and developers
> about this in e.g. pg_shmem_allocations.
>
> > For now, it
> > seems only for the sanity checks, but it could be seen as a useful
> > safety feature. A difference in maximum_size and minimum_size would
> > indicate that the structure is resizable.
>
> I think that's the right approach.
I also think that introducing minimum_size is useful. Let's hear from
Heikki before implementing it, in case he has a different opinion. I
am not sure about min_allocated_space though - what use do you see for
it. reserved_space is useful in pg_shmem_allocations() C function
itself and gives impact to the fully grown structure. What would
min_allocated_space give us? If at all it would be min_allocated_size
not space since reserved space will never change. But even that I am
not sure about.
>
> > 4. to mprotect or not to mprotect
> > ---------------------------------
> > If memory beyond the current size of a resizable structure is
> > accessed, it won't cause any segfault or bus error. When writing
> > memory will be simply allocated and when reading, it will return
> > zeroes if memory is not allocated yet. mprotect'ing the memory beyond
> > the current size of a resizable structure to PROT_NONE can prevent
> > accidental access to unallocated memory (sans page boundaries), but it
> > needs to be done in every backend process which requires a
> > synchronization mechanism beyond the scope of shmem.c. Hence the patch
> > does not use mprotect.
>
> It seems to me that the synchronization is a crucial component of
> resizing; isn't it bad if shmem structs can suddenly without
> synchronization contain zeroes?
>
> > A subsystem will require some higher level
> > synchronization mechanism between users of the structure and the
> > process which resizes it. That synchronization mechanism can be used
> > to mprotect the memory, if required. I have documented this, but I
> > would like to know whether we should provide an API in shmem.c to
> > mprotect.
>
> I think we should; I think it would simplify and deduplicate external
> code that needs to mark the pages PROT_NONE, and centralize OS page
> calculations to within the shmem subsystem.
> It'd also allow checks that validate that the pages marked with
> PROT_NONE are 1) within a shmem allocation and 2) currently not in use
> by that shmem allocation.
Reasonable. Let's wait for Heikki's opinion on this as well before
implementing it.
>
> (Was there a point 5. for discussion? I can't find it)
There is no point 5, just bad numbering.
>
> (This is where I ran out of time for these questions, sorry I didn't
> get to point 6)
CFBot did show some failures.
1. Makefile didn't define PGXS, fixed it.
2. Windows compiler didn't like #ifdef in the middle of function like
macro argument list C5101 and C2059. Used a conditional macro instead.
3. The test fails one one machine because RssShmem is consistently 8MB
higher than the allocated_size in all cases. I guess it is because of
huge page setting. Adding huge_pages = off to the test configuration.
I think the test can not rely on huge pages anyway since
allocated_size isn't aligned to huge page size.
--
Best Wishes,
Ashutosh Bapat
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2026-04-05 11:47:30 | Re: [PROPOSAL] Termination of Background Workers for ALTER/DROP DATABASE |
| Previous Message | Etsuro Fujita | 2026-04-05 11:05:42 | Re: Import Statistics in postgres_fdw before resorting to sampling. |