Re: Additional message in pg_terminate_backend

From: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: warda Bibi <wardabibi221(at)gmail(dot)com>
Cc: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, Roman Khapov <rkhapov(at)yandex-team(dot)ru>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Additional message in pg_terminate_backend
Date: 2026-06-23 18:16:47
Message-ID: CA+renyUsB3T07L14naz7P8gZz1zz8x=3MJcQnn5BWmwjx=FqSw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We talked about this patch for the Patch Review Workshop, and I
thought I would share a few thoughts.

On Sun, Apr 5, 2026 at 12:27 PM warda Bibi <wardabibi221(at)gmail(dot)com> wrote:
>
> We agree with Jim's point of view that the v3 approach is better. v5
> introduced separate _msg overloads (OID 8223 for
> pg_cancel_backend_msg, OID 8222 for pg_terminate_backend_msg), which
> adds catalog bloat and forces callers to use a different function
> name:
>
> > +{ oid => '8223', descr => 'cancel a server process\' current query',
> > proname => 'pg_cancel_backend', provolatile => 'v', prorettype => 'bool',
> > proargtypes => 'int4 text', prosrc => 'pg_cancel_backend_msg' },
> > +{ oid => '8222', descr => 'terminate a server process',
> > proname => 'pg_terminate_backend', provolatile => 'v', prorettype => 'bool',
> > proargtypes => 'int4 int8 text', proargnames => '{pid,timeout,message}',
> > prosrc => 'pg_terminate_backend_msg' },
>
> V3 keeps one OID per function and stays backward-compatible. v6
> restores the v3 approach.

I agree about adding defaults to the existing functions instead of
introducing new ones. This doesn't need to be an extension, so there
is no need to have separate functions.

> --- a/src/backend/catalog/system_functions.sql
> +++ b/src/backend/catalog/system_functions.sql
> @@ -378,6 +378,16 @@ BEGIN ATOMIC
> END;
>
> +CREATE OR REPLACE FUNCTION
> + pg_cancel_backend(pid integer, message text DEFAULT '')
> + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_cancel_backend'
> + PARALLEL SAFE;
> +
> +CREATE OR REPLACE FUNCTION
> + pg_terminate_backend(pid integer, timeout int8 DEFAULT 0, message
> text DEFAULT '')
> + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS 'pg_terminate_backend'
> + PARALLEL SAFE;
> +

Are these entries still needed? After 759b03b24c we can set defaults
with just pg_proc.dat.

> The current approach scans all MaxBackends slots by PID twice. Since
> pg_signal_backend() already calls BackendPidGetProc(pid) and has the
> PGPROC* in hand, the PGPROC index into allProcs[ ] is the ProcNumber,
> which is also the direct index into BackendMsgSlots[]. So v6 changes
> the signature to BackendMsgSet(ProcNumber procno, ...) and the call in
> pg_signal_backend() passes GetNumberFromPGProc(proc), making the
> lookup O(1).

This seems like a nice improvement. But is there a race condition?
What keeps the backend from being killed (or just finishing the
connection) and the slot being re-used? v5 was checking the pid while
holding the spinlock; perhaps we should still do that.

> The _internal helper functions (pg_cancel_backend_internal,
> pg_terminate_backend_internal) each have only one call site. What is
> the intent behind keeping them separate, or can they be inlined
> directly into pg_cancel_backend() and pg_terminate_backend()?

Since we aren't implementing this as an extension, I don't think there
is any reason to keep the _internal helper functions.

> Andrey noted that:
>
> > We have a race condition if many backends cancel same backend.
> > Won't they mess each other's reason?
>
> The spinlock in BackendMsgSet protects the write, but there is no
> atomicity between writing the slot and delivering the signal. Another
> backend can overwrite the slot between one writer's write and the
> target reading it. Would it be valuable to define the behavior
> explicitly, for example last-writer-wins, or document the limitation?

I think last-writer-wins is okay for canceling the same backend twice,
but we should make sure the slot is still owned by the right pid.

> Andrey also raised this about translation:
>
> > Keep in mind that Postgres literals are translated into many languages.
> > So text ought to be clear enough for translators to build a sentence
> > that precedes termination reason.
>
> The current pattern:
>
> > errmsg("terminating connection due to administrator command: %s", msg)
>
> The colon-append pattern breaks in languages where the reason clause
> is structured differently. Any suggestions on how to approach this, or
> how similar cases are handled elsewhere in the codebase?

What if we emit msg as an errdetail? That seems to resolve the
translation difficulties, and it makes the logs more structured.

> Also, parallel workers receive the generic message as they hit the
> IsBackgroundWorker branch in ProcessInterrupts(), which bypasses
> BackendMsgGet(). This is fine since parallel workers are ephemeral,
> but should it be documented in the docs?

That seems okay. Agreed about documenting it.

The v6 patch looks like an incremental patch on top of a previous
patch (v5 I think). But v5, v4, and v3 no longer apply. Would you mind
rebasing and putting all the changes into one patch? I think your
suggestions are extensive enough that a single file would be easier to
review.

Yours,

--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2026-06-23 18:16:52 Re: psql: Fix CREATE SCHEMA scanning of nested routine bodies
Previous Message Peter Geoghegan 2026-06-23 18:15:47 Re: [GSoC 2026] - B-tree Index Bloat Reduction - Approach & Questions