| From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
|---|---|
| To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
| Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, chaturvedipalak1911(at)gmail(dot)com |
| Subject: | Re: Better shared data structure management and resizable shared data structures |
| Date: | 2026-04-05 09:06:27 |
| Message-ID: | CAEze2WiD7m+A+3OrAK0JU265BZ4P48_AuKY-X5siJ=294tBeDQ@mail.gmail.com |
| Lists: | pgsql-hackers |
On Sun, 5 Apr 2026, 07:59 Ashutosh Bapat, <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > I will post my resizable shmem structures patch in a separate email in
> > this thread but continue to review your patches.
> >
>
> Attached is your patchset (0001 - 0014) + resizable shared memory
> structures patchset 0015.
>
> Resizable shared memory structures
> ============================
>
> When allocating memory to the requested shared structures, we allocate
> space for each structure. In mmap'ed shared memory, the memory is
> allocated against those structures only when those structures are
> initialized.
> Resizable shared memory structures are simply allocated maximum space
> when that happens. The function which initializes the structure is
> expected to initialize only the memory worth its initial size. When
> resizing the structure, memory is freed or allocated against the
> reserved space depending upon the new size. This allows the structures
> to be resized while keeping their starting address stable, which is a
> hard requirement in PostgreSQL.
>
> The resizable shared memory feature depends upon the existence of the
> function madvise() and the constants MADV_REMOVE and MADV_POPULATE_WRITE.
>
> On the platforms which do not have these, we disable this feature at
> compile time. The commit introduces a compile time flag
> HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
> MADV_POPULATE_WRITE exist. We don't check the existence of madvise
> separately, since the existence of the constants implies the existence
> of the function.
>
> HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since
> that's largely used for Windows where the APIs to free and allocate
> memory from and to a given address space are not known to the author
> right now. Given that PostgreSQL is used widely on Linux, providing
> this feature on Linux benefits most of its users. Once we
> figure out the required Windows APIs, we will support this feature on
> Windows as well.
>
> The feature is also not available when Sys-V shared memory is used
> even on Linux since we do not know whether required Sys-V APIs exist;
> mostly they don't. Since that combination is only available for
> development and testing, not supporting the feature isn't going to
> impact PostgreSQL users much.
>
> Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
> resizable shared memory structures on the platforms which do not
> support the feature. But we also have run time checks to disable this
> feature when Sys-V shared memory is used. In order to know whether a
> given instance of a running server supports resizable structures, we
> have introduced GUC have_resizable_shmem.
I'm not opposed to HAVE_RESIZABLE_SHMEM, but is it universal enough on
its platforms to make it part of the exposed ABI for Shmem? I think
that we should expose the same functions and structs everywhere, and
just have the shmem internals throw an error if the configuration used
by the user implies the user wants to update shmem sizing when the
system doesn't support it. That would spare extensions a recompile
between systems that have the feature and systems that lack it but
otherwise share a compatible ABI; especially when those extensions
don't actually need the resizable part of the shmem system.
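A minimal sketch of that idea, under stated assumptions: DemoShmemDesc, demo_shmem_resize and demo_have_resizable_shmem are hypothetical names for illustration, not identifiers from the patch. The resize entry point and the struct layout are compiled in everywhere, and only an actual resize attempt on an unsupported system reports a failure. In PostgreSQL the failure path would presumably be an ereport(ERROR) rather than a return code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical result codes; real code would likely raise an error instead. */
typedef enum
{
    DEMO_RESIZE_OK,
    DEMO_RESIZE_UNSUPPORTED,
    DEMO_RESIZE_OUT_OF_RANGE
} DemoResizeResult;

/* Hypothetical descriptor; its layout is identical on every platform. */
typedef struct
{
    size_t size;
    size_t minimum_size;
    size_t maximum_size;
} DemoShmemDesc;

/*
 * Runtime capability flag, playing the role of the have_resizable_shmem
 * GUC. It starts from what the headers offer (on Linux the populate
 * constant is spelled MADV_POPULATE_WRITE); a server could additionally
 * clear it at startup, e.g. when Sys-V shared memory is in use.
 */
static bool demo_have_resizable_shmem =
#if defined(MADV_REMOVE) && defined(MADV_POPULATE_WRITE)
    true;
#else
    false;
#endif

static DemoResizeResult
demo_shmem_resize(DemoShmemDesc *desc, size_t new_size)
{
    assert(desc != NULL);

    /* A no-op "resize" is fine everywhere, so fixed-size users never fail. */
    if (new_size == desc->size)
        return DEMO_RESIZE_OK;
    if (!demo_have_resizable_shmem)
        return DEMO_RESIZE_UNSUPPORTED;
    if (new_size < desc->minimum_size || new_size > desc->maximum_size)
        return DEMO_RESIZE_OUT_OF_RANGE;

    /* A real implementation would release or populate pages here. */
    desc->size = new_size;
    return DEMO_RESIZE_OK;
}
```

An extension that never resizes anything only ever takes the first branch, so it behaves the same whether or not the platform supports resizing.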
> Following points are up for discussion
> =============================
>
> 1. calculation of allocated_size of resizable structures
> --------------------------------
> The memory allocated to that structure is the
> {maximum size of the structure} - {total size of unallocated pages}. I
> think setting allocated_size to the actually allocated memory is more
> accurate than {current size of the structure} + {alignment} which does
> not reflect the actual memory allocated to the structure. I would like
> to know what others think.
I agree: For allocated_size, it should be the max size of the
structure (+alignment, if any), minus the total size of its
deallocated pages.
Nit: I think "reserved"/"space_reserved" is a better descriptor than
"allocated_space", as "allocated_space" could reasonably imply the
memory isn't available to the OS.
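The calculation agreed on above can be sketched as follows. All names are hypothetical, and the sketch assumes that only whole OS pages lying entirely beyond the current size are released; "start" is the structure's byte offset from a page-aligned origin.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of align, which must be a power of two. */
static size_t
demo_align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Round n down to a multiple of align, which must be a power of two. */
static size_t
demo_align_down(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return n & ~(align - 1);
}

/*
 * Bytes of the reservation still backed by memory: the type-aligned
 * maximum size minus the whole pages released beyond the current size.
 */
static size_t
demo_backed_size(size_t start, size_t current_size, size_t maximum_size,
                 size_t type_align, size_t page_size)
{
    size_t reserved = demo_align_up(maximum_size, type_align);
    size_t first_free = demo_align_up(start + current_size, page_size);
    size_t last_free = demo_align_down(start + reserved, page_size);
    size_t released = last_free > first_free ? last_free - first_free : 0;

    assert(current_size <= maximum_size);
    return reserved - released;
}
```

For a 40960-byte reservation that starts on a page boundary, with 12289 bytes in use and 4096-byte pages, this yields 16384: the partially used fourth page still counts as backed, which is why the result differs from the current size plus alignment.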
> 2. maximum_size member in various structures and in pg_shmem_allocations view
> -----------------------------------------------------------------------------
> A resizable structure is requested by specifying non-zero maximum_size
> in ShmemStructOpts. It gets copied to the maximum_size member in
> ShmemStructDesc, ShmemIndexEnt. The question is for fixed-size
> structures what should be the value maximum_size in those structures.
> Setting it to the same value as the size member in the respective
> structure is logical since their maximum size is the same as their
> initial size.
Note that currently, your patch rejects the case where resizable
structs are initialized at their maximum size:
> +++ b/src/backend/storage/ipc/shmem.c
> +#ifdef HAVE_RESIZABLE_SHMEM
> + if (options->maximum_size > 0 && options->size >= options->maximum_size)
> + elog(ERROR, "resizable shared memory structure \"%s\" should have maximum size (%zd) greater than size (%zd)",
> + options->name, options->maximum_size, options->size);
It'd need to check 'options->size > options->maximum_size' to allow
max-sized initialization to succeed here without erroring.
> But if we do so, we need another member in
> ShmemStructDesc and ShmemIndexEnt to indicate whether the structure is
> resizable or not. Instead the patches set maximum_size to 0 for
> fixed-size structures and non-zero for resizable structures. This way
> we can check whether a structure is resizable or not by checking
> whether its maximum_size is zero or not. pg_shmem_allocations view
> also has a maximum_size column which has the similar characteristics.
> I would like to know what others think.
I think that shmem allocations can set
.size for the initial size, and
.minimum_size/.maximum_size for configuring resizability.
The latter fields can then be initialized with .size if they're 0.
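A small sketch of how that defaulting could look, combined with the relaxed bounds check suggested earlier so that initializing at the maximum size is accepted. DemoShmemOpts and both functions are hypothetical stand-ins, not the patch's ShmemStructOpts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical options struct; 0 in a bound means "same as size". */
typedef struct
{
    const char *name;
    size_t size;          /* initial size */
    size_t minimum_size;  /* 0 = default to size */
    size_t maximum_size;  /* 0 = default to size */
} DemoShmemOpts;

/*
 * Fill in defaulted bounds, then validate them. Returns false when the
 * request is inconsistent; size == maximum_size is explicitly allowed.
 */
static bool
demo_normalize_opts(DemoShmemOpts *opts)
{
    assert(opts != NULL);

    if (opts->minimum_size == 0)
        opts->minimum_size = opts->size;
    if (opts->maximum_size == 0)
        opts->maximum_size = opts->size;

    return opts->minimum_size <= opts->size &&
           opts->size <= opts->maximum_size;
}

/* After normalization, a structure is resizable iff its bounds differ. */
static bool
demo_is_resizable(const DemoShmemOpts *opts)
{
    return opts->minimum_size != opts->maximum_size;
}
```

One consequence of using 0 as the "unset" marker is that a minimum size of exactly zero cannot be requested; whether that matters is a design question for the real API.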
> 3. allocated_space member in various structures and in pg_shmem_allocations view
> -------------------------------------------------------------------------------
> The patch adds a new member allocated_space to ShmemIndexEnt and
> pg_shmem_allocations view. allocated_space to maximum_size is what
> allocated_size is to size - it's the type aligned value of
> maximum_size. But it also highlights the difference between the
> address space allocation and the actual memory allocation. This
> difference is crucial to resizable structures. However, unlike
> maximum_size, we set it to a non-zero value, allocated_size, for
> fixed-size structures as well since they are allocated the same amount
> of space as their allocated_size. While this seems logically correct
> to me, some may find maximum_size to be zero but allocated_space to be
> non-zero for fixed-size structures a bit weird. I would like to know
> what others think.
I'd prefer to have consistent values; constant-sized structs are no
different from resizable structs whose min/max size equal their
current size. The only alternative that I think could be considered
correct is returning NULL for those, but zero is definitely wrong.
Note that returning min/max=size would also allow for better
aggregations on pg_shmem_allocations columns.
Note: if we expose minimum_size, we may also want to expose
min_allocated_size (i.e., the full reservation minus the size of
MADV_REMOVEd pages when the shmem allocation is min-sized).
> As a side question, do we want to allow users to specify minimum_size
> in ShmemStructOpts for resizable structures? Resizing memory lower
> than that would be prohibited. For fixed sized structures,
> minimum_size would be same as size and also maximum_size.
I think it would be useful, if only to inform users and developers
about this in e.g. pg_shmem_allocations.
> For now, it
> seems only for the sanity checks, but it could be seen as a useful
> safety feature. A difference in maximum_size and minimum_size would
> indicate that the structure is resizable.
I think that's the right approach.
> 4. to mprotect or not to mprotect
> ---------------------------------
> If memory beyond the current size of a resizable structure is
> accessed, it won't cause any segfault or bus error. On a write, memory
> will simply be allocated; on a read, zeroes are returned if memory is
> not allocated yet. mprotect'ing the memory beyond
> the current size of a resizable structure to PROT_NONE can prevent
> accidental access to unallocated memory (sans page boundaries), but it
> needs to be done in every backend process which requires a
> synchronization mechanism beyond the scope of shmem.c. Hence the patch
> does not use mprotect.
It seems to me that the synchronization is a crucial component of
resizing; isn't it bad if shmem structs can suddenly contain zeroes
without any synchronization?
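The behaviour in question can be reproduced outside PostgreSQL with a small Linux-only sketch. It is not the patch's code: demo_reserve and demo_resize are hypothetical helpers, and the caller is assumed to keep both sizes within the reserved maximum. Note that the Linux constant for pre-faulting pages is spelled MADV_POPULATE_WRITE in <sys/mman.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round n up to a multiple of the OS page size. */
static size_t
demo_page_up(size_t n)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);

    return (n + page - 1) / page * page;
}

/* Reserve address space for the maximum size; pages are backed lazily. */
static void *
demo_reserve(size_t maximum_size)
{
    void *p = mmap(NULL, maximum_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Resize in place: the base address never changes. Shrinking releases
 * whole pages with MADV_REMOVE, growing pre-faults them with
 * MADV_POPULATE_WRITE. Returns 0 on success, -1 with errno set otherwise.
 */
static int
demo_resize(void *base, size_t old_size, size_t new_size)
{
    size_t from = demo_page_up(old_size < new_size ? old_size : new_size);
    size_t to = demo_page_up(old_size < new_size ? new_size : old_size);

    assert(base != NULL);
    if (from == to)
        return 0;               /* the change stays within one page */
#if defined(MADV_REMOVE) && defined(MADV_POPULATE_WRITE)
    return madvise((char *) base + from, to - from,
                   new_size < old_size ? MADV_REMOVE : MADV_POPULATE_WRITE);
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

After a successful shrink, a process that still reads the released range gets zeroes at the same address and no fault at all, so nothing in this layer tells a stale reader that the data is gone.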
> A subsystem will require some higher level
> synchronization mechanism between users of the structure and the
> process which resizes it. That synchronization mechanism can be used
> to mprotect the memory, if required. I have documented this, but I
> would like to know whether we should provide an API in shmem.c to
> mprotect.
I think we should; I think it would simplify and deduplicate external
code that needs to mark the pages PROT_NONE, and centralize OS page
calculations to within the shmem subsystem.
It'd also allow checks that validate that the pages marked with
PROT_NONE are 1) within a shmem allocation and 2) currently not in use
by that shmem allocation.
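A rough sketch of what such a helper might check, assuming a hypothetical DemoShmemAlloc descriptor and function name (neither is from the patch). It only covers the calling process; as the quoted text notes, every backend would still have to make the same call under some higher-level synchronization.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical view of one shmem allocation. */
typedef struct
{
    char *base;       /* stable start address */
    size_t size;      /* bytes currently in use */
    size_t reserved;  /* bytes of address space reserved */
} DemoShmemAlloc;

/*
 * Mark [offset, offset + len) of the allocation PROT_NONE in this
 * process. Returns 0 on success, -1 with errno set on failure.
 */
static int
demo_shmem_protect_unused(const DemoShmemAlloc *alloc, size_t offset,
                          size_t len)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);

    assert(alloc != NULL && alloc->size <= alloc->reserved);

    /* 1) The range must lie within the allocation's reservation. */
    if (offset > alloc->reserved || len > alloc->reserved - offset)
        goto invalid;
    /* 2) The range must not cover bytes the allocation currently uses. */
    if (offset < alloc->size)
        goto invalid;
    /* mprotect() operates on whole pages only. */
    if ((uintptr_t) (alloc->base + offset) % page != 0 || len % page != 0)
        goto invalid;

    return len == 0 ? 0 : mprotect(alloc->base + offset, len, PROT_NONE);

invalid:
    errno = EINVAL;
    return -1;
}
```

Because the range must start on a page boundary at or beyond the in-use bytes, the page holding the tail of live data is never protected, which matches the "sans page boundaries" caveat in the quoted text.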
(Was there a point 5. for discussion? I can't find it)
(This is where I ran out of time for these questions, sorry I didn't
get to point 6)
Kind regards,
Matthias van de Meent