| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, chaturvedipalak1911(at)gmail(dot)com |
| Subject: | Re: Better shared data structure management and resizable shared data structures |
| Date: | 2026-03-27 23:17:13 |
| Message-ID: | b2451225-0a10-4ded-abef-4e9efbde7a8e@iki.fi |
| Lists: | pgsql-hackers |
Thanks, I'll incorporate your comments in the next version. Replying to
just a few of them here:
On 27/03/2026 09:01, Ashutosh Bapat wrote:
> /* Restore basic shared memory pointers */
> if (UsedShmemSegAddr != NULL)
> + {
> InitShmemAllocator(UsedShmemSegAddr);
> + ShmemCallRequestCallbacks();
>
> It's not clear how we keep the list of registered callbacks across the
> backends and also after restart in-sync. How do we make sure that the
> callbacks registered at this time are the same callbacks registered
> before creating the shared memory? How do we make sure that the
> callbacks registered after the startup are also registered after
> restart?
On Unix systems, the registered callbacks are inherited by fork(), and
also survive a crash restart. With EXEC_BACKEND, the assumption is that
calling a library's _PG_init() function will register the same
callbacks every time. We make the same assumption today with the
shmem_startup hook.
> +void
> +RegisterShmemCallbacks(const ShmemCallbacks *callbacks)
> ... snip ...
> + foreach(lc, requested_shmem_areas)
>
> Doesn't this list contain all the areas, not just registered in this
> instance of the call. Does that mean that we need to have all the
> attach functions idempotent? Why can't we deal with the newly
> registered areas only?
registered_shmem_areas is supposed to be empty when the function is
entered. There's an assertion for that too before the foreach().
However, it's missing this cleanup after processing the list:
list_free_deep(requested_shmem_areas);
requested_shmem_areas = NIL;
Because of that, this will fail if you load multiple extensions that
call RegisterShmemCallbacks() in the same session. Will fix that.
> + /*
> + * Extra space to reserve in the shared memory segment, but it's not part
> + * of the struct itself. This is used for shared memory hash tables that
> + * can grow beyond the initial size when more buckets are allocated.
> + */
> + size_t extra_size;
>
> When we introduce resizable structures (where even the hash table
> directly itself could be resizable), we will introduce a new field
> max_size which is easy to get confused with extra_size. Maybe we can
> rename extra_size to something like "auxiliary_size" to mean size of
> the auxiliary parts of the structure which are not part of the main
> struct itself.
>
> + /*
> + * max_size is the estimated maximum number of hashtable entries. This is
> + * not a hard limit, but the access efficiency will degrade if it is
> + * exceeded substantially (since it's used to compute directory size and
> + * the hash table buckets will get overfull).
> + */
> + size_t max_size;
> +
> + /*
> + * init_size is the number of hashtable entries to preallocate. For a
> + * table whose maximum size is certain, this should be equal to max_size;
> + * that ensures that no run-time out-of-shared-memory failures can occur.
> + */
> + size_t init_size;
>
> Every time I look at these two fields, I question whether those are the
> number of entries (i.e. size of the hash table) or number of bytes
> (size of the memory). I know it's the former, but it indicates that
> something needs to be changed here, like changing the names to have
> _entries instead of _size, or changing the type to int64 or some such.
> Renaming to _entries would conflict with dynahash APIs since they use
> _size, so maybe the latter?
Agreed.
- Heikki