Re: Better shared data structure management and resizable shared data structures

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, chaturvedipalak1911(at)gmail(dot)com
Subject: Re: Better shared data structure management and resizable shared data structures
Date: 2026-03-27 23:17:13
Message-ID: b2451225-0a10-4ded-abef-4e9efbde7a8e@iki.fi
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks, will incorporate your comments in next version. Replying to just
a few of them here:

On 27/03/2026 09:01, Ashutosh Bapat wrote:
> /* Restore basic shared memory pointers */
> if (UsedShmemSegAddr != NULL)
> + {
> InitShmemAllocator(UsedShmemSegAddr);
> + ShmemCallRequestCallbacks();
>
> It's not clear how we keep the list of registered callbacks across the
> backends and also after restart in-sync. How do we make sure that the
> callbacks registered at this time are the same callbacks registered
> before creating the shared memory? How do we make sure that the
> callbacks registered after the startup are also registered after
> restart?

On Unix systems, the registered callbacks are inherited by fork(), and
also survive over crash restart. With EXEC_BACKEND, the assumption is
that calling a library's _PG_init() function will register the same
callbacks every time. We make the same assumption today with the
shmem_startup hook.

> +void
> +RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
> ... snip ...
> + foreach(lc, requested_shmem_areas)
>
> Doesn't this list contain all the areas, not just registered in this
> instance of the call. Does that mean that we need to have all the
> attach functions idempotent? Why can't we deal with the newly
> registered areas only?

registered_shmem_areas is supposed to be empty when the function is
entered. There's an assertion for that too before the foreach().

However, it's missing this, after processing the list:

list_free_deep(requested_shmem_areas);
requested_shmem_areas = NIL;

Because of that, this will fail if you load multiple extensions that
call RegisterShmemCallbacks() in the same session. Will fix that.

> + /*
> + * Extra space to reserve in the shared memory segment, but it's not part
> + * of the struct itself. This is used for shared memory hash tables that
> + * can grow beyond the initial size when more buckets are allocated.
> + */
> + size_t extra_size;
>
> When we introduce resizable structures (where even the hash table
> directly itself could be resizable), we will introduce a new field
> max_size which is easy to get confused with extra_size. Maybe we can
> rename extra_size to something like "auxilliary_size" to mean size of
> the auxiliary parts of the structure which are not part of the main
> struct itself.
>
> + /*
> + * max_size is the estimated maximum number of hashtable entries. This is
> + * not a hard limit, but the access efficiency will degrade if it is
> + * exceeded substantially (since it's used to compute directory size and
> + * the hash table buckets will get overfull).
> + */
> + size_t max_size;
> +
> + /*
> + * init_size is the number of hashtable entries to preallocate. For a
> + * table whose maximum size is certain, this should be equal to max_size;
> + * that ensures that no run-time out-of-shared-memory failures can occur.
> + */
> + size_t init_size;
>
> Everytime I look at these two fields, I question whether those are the
> number of entries (i.e. size of the hash table) or number of bytes
> (size of the memory). I know it's the former, but it indicates that
> something needs to be changed here, like changing the names to have
> _entries instead of _size, or changing the type to int64 or some such.
> Renaming to _entries would conflict with dynahash APIs since they use
> _size, so maybe the latter?

Agreed.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sami Imseih 2026-03-27 23:17:38 Re: another autovacuum scheduling thread
Previous Message Sami Imseih 2026-03-27 23:14:14 Add pg_stat_autovacuum_priority