| From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
| Cc: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, chaturvedipalak1911(at)gmail(dot)com |
| Subject: | Re: Better shared data structure management and resizable shared data structures |
| Date: | 2026-04-05 05:58:51 |
| Message-ID: | CAExHW5stth2mdXh3ukn9rWJ+Pruoat+5tY3CYyM6KGqH2G30fQ@mail.gmail.com |
| Lists: | pgsql-hackers |
On Sun, Apr 5, 2026 at 11:18 AM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> I will post my resizable shmem structures patch in a separate email in
> this thread but continue to review your patches.
>
Attached is your patchset (0001 - 0014) + resizable shared memory
structures patchset 0015.
Resizable shared memory structures
============================
When allocating memory to the requested shared structures, we allocate
space for each structure. In mmap'ed shared memory, the memory is
allocated against those structures only when those structures are
initialized.
Resizable shared memory structures are simply allocated their maximum
space when that happens. The function which initializes the structure is
expected to initialize only the memory worth its initial size. When the
structure is resized, memory is freed or allocated within the reserved
space depending on the new size. This allows structures to be resized
while keeping their starting address stable, which is a hard
requirement in PostgreSQL.
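A minimal sketch of the reserve-then-resize idea described above, assuming
Linux and a shared anonymous mmap. SketchArea, sketch_reserve() and
sketch_resize() are hypothetical names for illustration and are not taken
from the patch; on Linux the populate constant is spelled
MADV_POPULATE_WRITE in <sys/mman.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS and the MADV_* constants */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical descriptor for this sketch; not a structure from the patch. */
typedef struct SketchArea
{
    char       *base;           /* start address, never changes */
    size_t      max_size;       /* reserved address space, page aligned */
    size_t      cur_size;       /* bytes currently in use */
} SketchArea;

static size_t
sketch_page_up(size_t n)
{
    size_t      pagesz = (size_t) sysconf(_SC_PAGESIZE);

    return (n + pagesz - 1) / pagesz * pagesz;
}

/* Reserve max_size bytes; pages get backed only when first touched. */
static int
sketch_reserve(SketchArea *area, size_t initial_size, size_t max_size)
{
    void       *p;

    assert(initial_size <= max_size);
    max_size = sketch_page_up(max_size);
    p = mmap(NULL, max_size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    area->base = p;
    area->max_size = max_size;
    area->cur_size = initial_size;
    return 0;
}

/* Resize in place: free or populate whole pages, keep base unchanged. */
static int
sketch_resize(SketchArea *area, size_t new_size)
{
    size_t      old_end = sketch_page_up(area->cur_size);
    size_t      new_end = sketch_page_up(new_size);

    if (new_size > area->max_size)
        return -1;
#if defined(MADV_REMOVE) && defined(MADV_POPULATE_WRITE)
    /* Shrinking: give the pages beyond the new end back to the kernel. */
    if (new_end < old_end &&
        madvise(area->base + new_end, old_end - new_end, MADV_REMOVE) != 0)
        return -1;
    /* Growing: EINVAL means the kernel lacks MADV_POPULATE_WRITE; the
     * pages are then simply faulted in on first write. */
    if (new_end > old_end &&
        madvise(area->base + old_end, new_end - old_end,
                MADV_POPULATE_WRITE) != 0 && errno != EINVAL)
        return -1;
#endif
    area->cur_size = new_size;
    return 0;
}
```

The point of the sketch is only that base is assigned once in
sketch_reserve() and never touched by sketch_resize(); how the patch
tracks sizes and reports errors may well differ.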
The resizable shared memory feature depends on the madvise() function
and the constants MADV_REMOVE and MADV_POPULATE_WRITE.
On platforms which do not have these, we disable this feature at
compile time. The commit introduces a compile-time flag
HAVE_RESIZABLE_SHMEM which is defined if MADV_REMOVE and
MADV_POPULATE_WRITE exist. We don't check for the existence of madvise()
separately, since the existence of the constants implies the existence
of the function.
HAVE_RESIZABLE_SHMEM is not defined in EXEC_BACKEND builds since
that's largely used for Windows, where the APIs to free and allocate
memory from and to a given address space are not known to the author
right now. Given that PostgreSQL is used widely on Linux, providing
this feature on Linux benefits most of its users. Once we
figure out the required Windows APIs, we will support this feature on
Windows as well.
The feature is also not available when Sys-V shared memory is used
even on Linux since we do not know whether required Sys-V APIs exist;
mostly they don't. Since that combination is only available for
development and testing, not supporting the feature isn't going to
impact PostgreSQL users much.
Using HAVE_RESIZABLE_SHMEM we disable compiling the code related to
resizable shared memory structures on the platforms which do not
support the feature. But we also have run time checks to disable this
feature when Sys-V shared memory is used. In order to know whether a
given instance of a running server supports resizable structures, we
have introduced a GUC, have_resizable_shmem.
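The two levels of checking described above (compile time, then run time)
could be sketched as follows. This is an illustration only: the
SKETCH_-prefixed macro, the enum and the function are hypothetical, and
where the patch actually defines HAVE_RESIZABLE_SHMEM is not shown in this
email. Note that Linux spells the populate constant MADV_POPULATE_WRITE.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Compile-time part: both madvise() constants must exist, and the build
 * must not be an EXEC_BACKEND build. */
#if defined(MADV_REMOVE) && defined(MADV_POPULATE_WRITE) && \
    !defined(EXEC_BACKEND)
#define SKETCH_HAVE_RESIZABLE_SHMEM 1
#endif

/* Hypothetical stand-in for the server's shared memory type setting. */
typedef enum SketchShmemType
{
    SKETCH_SHMEM_MMAP,
    SKETCH_SHMEM_SYSV
} SketchShmemType;

/* Run-time part: what a read-only setting like have_resizable_shmem
 * might report for a given server instance. */
static bool
sketch_have_resizable_shmem(SketchShmemType type)
{
    assert(type == SKETCH_SHMEM_MMAP || type == SKETCH_SHMEM_SYSV);
#ifdef SKETCH_HAVE_RESIZABLE_SHMEM
    return type == SKETCH_SHMEM_MMAP;
#else
    return false;
#endif
}
```

With Sys-V shared memory the function reports false regardless of how the
binary was compiled, which matches the run-time check described above.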
The following points are up for discussion
=============================
1. calculation of allocated_size of resizable structures
--------------------------------
For fixed sized shared memory structures, allocated_size is the size
of the aligned structure. Assuming that the whole structure is
initialized, it is also the memory allocated to the structure. Thus
summing all allocated_size's of the allocations gives a
nearly-accurate (considering page sized allocations) idea of the total
shared memory allocated. For a resizable structure, it's a bit more
complicated. We allocate the maximum space required by the structure at
the beginning. At any given time, the pages up to the one containing the
current end of the structure are allocated, as is the page where the
next structure begins. The pages in between are not allocated. The
memory allocated to that structure is
{maximum size of the structure} - {total size of unallocated pages}. I
think setting allocated_size to the actually allocated memory is more
accurate than {current size of the structure} + {alignment} which does
not reflect the actual memory allocated to the structure. I would like
to know what others think.
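To make the proposed accounting concrete, here is a small arithmetic
sketch. It assumes the structure starts at byte offset start within the
segment and that the unallocated pages are exactly the whole pages between
the current end of the structure and the page where the next structure
begins. The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Memory actually backing a resizable structure:
 * {maximum size} - {total size of unallocated pages}. */
static size_t
sketch_allocated_size(size_t start, size_t cur_size, size_t max_size,
                      size_t pagesz)
{
    size_t      hole_start;
    size_t      hole_end;

    assert(cur_size <= max_size && pagesz > 0);
    /* first page boundary at or after the current end of the structure */
    hole_start = (start + cur_size + pagesz - 1) / pagesz * pagesz;
    /* start of the page where the next structure begins */
    hole_end = (start + max_size) / pagesz * pagesz;
    if (hole_end <= hole_start)
        return max_size;        /* no whole page is unallocated */
    return max_size - (hole_end - hole_start);
}
```

For example, with 4096-byte pages, start = 1000, a current size of 10000
and a maximum size of 100000, the unallocated range is [12288, 98304), so
the function returns 100000 - 86016 = 13984, compared with roughly 10000
for {current size of the structure} + {alignment}.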
2. maximum_size member in various structures and in pg_shmem_allocations view
-----------------------------------------------------------------------------
A resizable structure is requested by specifying non-zero maximum_size
in ShmemStructOpts. It gets copied to the maximum_size member in
ShmemStructDesc and ShmemIndexEnt. The question is what the value of
maximum_size should be in those structures for fixed-size structures.
Setting it to the same value as the size member in the respective
structure is logical since their maximum size is the same as their
initial size. But if we do so, we need another member in
ShmemStructDesc and ShmemIndexEnt to indicate whether the structure is
resizable or not. Instead the patches set maximum_size to 0 for
fixed-size structures and non-zero for resizable structures. This way
we can check whether a structure is resizable or not by checking
whether its maximum_size is zero or not. The pg_shmem_allocations view
also has a maximum_size column with the same characteristics.
I would like to know what others think.
3. allocated_space member in various structures and in pg_shmem_allocations view
-------------------------------------------------------------------------------
The patch adds a new member allocated_space to ShmemIndexEnt and
pg_shmem_allocations view. allocated_space to maximum_size is what
allocated_size is to size - it's the type aligned value of
maximum_size. But it also highlights the difference between the
address space allocation and the actual memory allocation. This
difference is crucial to resizable structures. However, unlike
maximum_size, we set it to a non-zero value, allocated_size, for
fixed-size structures as well since they are allocated the same amount
of space as their allocated_size. While this seems logically correct
to me, some may find it a bit weird that maximum_size is zero but
allocated_space is non-zero for fixed-size structures. I would like to
know what others think.
As a minor point, setting allocated_space to allocated_size makes the
calculations in pg_shmem_allocations() a bit easier. However, that can
be fixed trivially.
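The conventions from points 2 and 3 can be sketched together as below. The
field names follow the email, but SketchEntry, the helper functions and
the 64-byte alignment are assumptions made for illustration; the patch's
ShmemIndexEnt layout and alignment rule may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_ALIGN 64         /* assumed alignment, for illustration only */
#define SKETCH_ALIGN_UP(n) \
    (((n) + SKETCH_ALIGN - 1) / SKETCH_ALIGN * SKETCH_ALIGN)

/* Hypothetical mirror of the size-related fields discussed above. */
typedef struct SketchEntry
{
    size_t      size;               /* requested (initial) size */
    size_t      allocated_size;     /* simplified here to the aligned size */
    size_t      maximum_size;       /* 0 for fixed-size structures */
    size_t      allocated_space;    /* reserved address space */
} SketchEntry;

static SketchEntry
sketch_make_entry(size_t size, size_t maximum_size)
{
    SketchEntry e;

    assert(maximum_size == 0 || maximum_size >= size);
    e.size = size;
    e.allocated_size = SKETCH_ALIGN_UP(size);
    e.maximum_size = maximum_size;
    /* fixed-size structures reserve exactly what they allocate */
    e.allocated_space = (maximum_size != 0)
        ? SKETCH_ALIGN_UP(maximum_size)
        : e.allocated_size;
    return e;
}

/* A non-zero maximum_size doubles as the "resizable" flag. */
static bool
sketch_is_resizable(const SketchEntry *e)
{
    return e->maximum_size != 0;
}
```

A fixed-size entry therefore ends up with maximum_size = 0 and
allocated_space equal to allocated_size, which is exactly the combination
that may look odd to some readers.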
As a side question, do we want to allow users to specify minimum_size
in ShmemStructOpts for resizable structures? Resizing a structure below
that size would be prohibited. For fixed-size structures,
minimum_size would be the same as size and also maximum_size. For now, it
seems useful only for sanity checks, but it could be seen as a useful
safety feature. A difference between maximum_size and minimum_size would
indicate that the structure is resizable.
Considering 2 and 3 together, we have the following options:
a. As implemented in patch and clarified in documentation.
b. Set maximum_size to size and allocated_space to allocated_size for
fixed-size structures, but add a new member to indicate whether the
structure is resizable or not.
c. Set maximum_size and allocated_space to zero for fixed-size
structures and explicitly mention it in the documentation.
4. to mprotect or not to mprotect
---------------------------------
If memory beyond the current size of a resizable structure is
accessed, it won't cause any segfault or bus error. When writing,
memory will simply be allocated, and when reading, it will return
zeroes if memory is not allocated yet. mprotect'ing the memory beyond
the current size of a resizable structure to PROT_NONE can prevent
accidental access to unallocated memory (sans page boundaries), but it
needs to be done in every backend process which requires a
synchronization mechanism beyond the scope of shmem.c. Hence the patch
does not use mprotect. A subsystem will require some higher level
synchronization mechanism between users of the structure and the
process which resizes it. That synchronization mechanism can be used
to mprotect the memory, if required. I have documented this, but I
would like to know whether we should provide an API in shmem.c to
mprotect.
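For discussion, this is roughly what such a helper might look like. It is
a sketch under assumptions: sketch_protect_tail() is a hypothetical name,
base is assumed to be page aligned, and since page protections are
per-process, each backend would have to call it under whatever
synchronization the subsystem already uses.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS in the usage below */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Apply prot to the whole pages lying beyond cur_size. Pass PROT_NONE
 * after shrinking and PROT_READ | PROT_WRITE before growing. The page
 * containing the current end stays accessible, so accesses within that
 * page are not caught.
 */
static int
sketch_protect_tail(char *base, size_t cur_size, size_t max_size, int prot)
{
    size_t      pagesz = (size_t) sysconf(_SC_PAGESIZE);
    size_t      used_end = (cur_size + pagesz - 1) / pagesz * pagesz;

    assert(cur_size <= max_size);
    if (used_end >= max_size)
        return 0;               /* no whole page beyond the current size */
    return mprotect(base + used_end, max_size - used_end, prot);
}
```

After sketch_protect_tail(base, cur_size, max_size, PROT_NONE), a stray
access to a protected page raises SIGSEGV in the calling process instead
of silently allocating memory or returning zeroes.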
5. Tests
--------
The patch adds a new test module resizable_shmem which tests the
resizable shared memory feature. Also it adds a test case to the
test_shmem module to make sure that the fixed-size shared memory
structures cannot be resized. I think the resizable_shmem module
should be merged into test_shmem. But I have kept these two separate
for ease of review. Please let me know if you also think they should
be merged.
I have self-reviewed the tests a few times, fixing issues and
adjusting the test and module code. But they could use some more
review. However, I wanted to get the patch out for review, given the
looming deadline. The same goes for the commit message.
I am adding this to CF so that it gets some CI coverage especially on
the platforms which do not support resizable shared memory.
--
Best Wishes,
Ashutosh Bapat