Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Nathan Bossart <nathan(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent
Date: 2025-11-22 15:25:48
Message-ID: CA+TgmoZsxbVhoxNGTWSd6=vHvyoqOSDNXJZabBZx+St8uPduLQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Thu, Nov 20, 2025 at 6:09 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> Unpinning/detaching the segment/DSA/dshash table and deleting the DSM
> registry entry in a PG_CATCH block scares me a little, but it might be
> doable.

It seems a bit weird to be doing explicit unpinning in a PG_CATCH
block. Ideally you'd want to postpone the pinning until initialization
has succeeded, so that if you fail before that, transaction cleanup
takes care of it automatically. Alternatively, destroying what existed
before could be deferred until later, when an as-yet-unfailed
transaction stumbles across the tombstone.
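
[Illustration, not part of the original message.] The "postpone the
pinning" idea can be sketched with a toy model in plain C. This is not
PostgreSQL code: ToySegment, toy_create_and_init, and the callbacks
are hypothetical names, an ERROR is modeled as a false return value,
and transaction cleanup is modeled as a function that releases
anything not yet pinned.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared memory segment; not a PostgreSQL type. */
typedef struct ToySegment
{
    bool allocated;     /* segment currently exists */
    bool pinned;        /* when true, survives end of transaction */
    bool initialized;   /* init callback ran to completion */
} ToySegment;

/* Hypothetical init callback: returning false models throwing an ERROR. */
typedef bool (*toy_init_fn) (ToySegment *seg);

/* Models end-of-transaction cleanup: anything unpinned is released. */
void
toy_transaction_cleanup(ToySegment *seg)
{
    if (seg->allocated && !seg->pinned)
    {
        seg->allocated = false;
        seg->initialized = false;
    }
}

/*
 * Create the segment, run the callback, and pin only after it succeeds.
 * On failure nothing has been pinned, so the ordinary cleanup path
 * releases the segment and no explicit unpinning is required.
 */
bool
toy_create_and_init(ToySegment *seg, toy_init_fn init)
{
    seg->allocated = true;
    seg->pinned = false;
    seg->initialized = false;

    if (!init(seg))
    {
        toy_transaction_cleanup(seg);   /* models abort */
        return false;
    }

    seg->initialized = true;
    seg->pinned = true;                 /* pin last, once init is done */
    toy_transaction_cleanup(seg);       /* models commit: pinned, so kept */
    assert(seg->allocated);
    return true;
}

/* Example callbacks for the two outcomes. */
bool toy_init_ok(ToySegment *seg)   { (void) seg; return true; }
bool toy_init_fail(ToySegment *seg) { (void) seg; return false; }
```

With toy_init_fail the segment ends up neither allocated nor pinned;
with toy_init_ok it ends up allocated, pinned, and initialized. Whether
the real DSM registry can be ordered this way is the open question in
the thread.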

> Another thing that might be subconsciously guiding my decisions here is the
> existing behavior when a shmem request/startup hook ERRORs (server startup
> fails). I'd expect DSM registry users to be doing similar things in their
> initialization callbacks, and AFAIK this behavior hasn't been a source of
> complaints.

What bugs me here is the fact that you have a perfectly good working
server that you can't "salvage". If the server had failed at startup,
that would have sucked, but you would have been forced into correcting
the problem (or giving up on the extension) and restarting, so once
the server gets up and running, it's always in a good state. With this
mechanism, you can get a running server that's stuck in this
failed-initialization state, and there's no way for the DBA to fix
it except by bouncing the entire server. That seems like it
could be really frustrating. Now, if it's the case that resetting this
mechanism wouldn't fundamentally fix anything because the
reinitialization attempt would inevitably fail for the same reason,
then I suppose it's not so bad, but I'm not sure that would always be
true. I just feel like one-way state transitions from good->bad are
undesirable. One way of viewing log levels like ERROR, FATAL, and
PANIC is that they force you to do a reset of the transaction,
session, or entire server to get back to a good state, but here you
just get stuck in the bad one.
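
[Illustration, not part of the original message.] One way to picture
an entry that does not get stuck is a toy state machine in which a
failed initialization leaves a tombstone that the next caller clears
and retries, rather than a permanent error state. Everything here is
hypothetical (ToyEntry, toy_entry_attach, the callback signature); it
models only the state transitions, not locking or shared memory.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToyEntryState
{
    TOY_ENTRY_EMPTY,    /* never initialized */
    TOY_ENTRY_READY,    /* initialization succeeded */
    TOY_ENTRY_FAILED    /* tombstone left by a failed attempt */
} ToyEntryState;

typedef struct ToyEntry
{
    ToyEntryState state;
    int           attempts;   /* how many times init was run */
} ToyEntry;

/* Hypothetical init callback: returning false models throwing an ERROR. */
typedef bool (*toy_entry_init_fn) (void *arg);

/*
 * Attach to the entry, initializing it if needed.  A tombstone is not
 * terminal: the next caller resets it and tries again, so the entry can
 * move from bad back to good without a server restart.
 */
bool
toy_entry_attach(ToyEntry *entry, toy_entry_init_fn init, void *arg)
{
    if (entry->state == TOY_ENTRY_READY)
        return true;

    if (entry->state == TOY_ENTRY_FAILED)
    {
        /* Deferred cleanup of the earlier failed attempt would go here. */
        entry->state = TOY_ENTRY_EMPTY;
    }

    assert(entry->state == TOY_ENTRY_EMPTY);
    entry->attempts++;
    entry->state = init(arg) ? TOY_ENTRY_READY : TOY_ENTRY_FAILED;
    return entry->state == TOY_ENTRY_READY;
}

/* Example callback: succeeds only once the flag behind arg is true. */
bool
toy_init_if_fixed(void *arg)
{
    return *(bool *) arg;
}
```

If the underlying cause is never corrected, each attach simply fails
again, which matches the case described above where resetting would
not fundamentally fix anything.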

Am I worrying too much? Possibly! But as I said to David on another
thread this morning, it's better to worry on pgsql-hackers before any
problem happens than to start worrying after something bad happens in
a customer situation.

--
Robert Haas
EDB: http://www.enterprisedb.com
