Re: Additional message in pg_terminate_backend

From: warda Bibi <wardabibi221(at)gmail(dot)com>
To: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
Cc: Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, Roman Khapov <rkhapov(at)yandex-team(dot)ru>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Additional message in pg_terminate_backend
Date: 2026-06-29 10:21:23
Message-ID: CAJqHjGriG=FiwWQ4M2fOVzUwwG0Rg87fhSQgf68heNhVh=SBgA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Paul, Roman, and all,

Thank you for the thorough review, Paul. I have put together v7 addressing
all of the feedback below.

On Mon, Jun 23, 2026, Paul A Jungwirth wrote:

> Are these entries still needed? After 759b03b24c we can set defaults with
just pg_proc.dat.

Correct. v7 removes the system_functions.sql entries entirely; the DEFAULT
values are now specified only in pg_proc.dat, which is sufficient after
759b03b24c.

> What keeps the backend from being killed and the slot being re-used? v5
was checking the pid while holding the spinlock; perhaps we should still do
that.

BackendMsgSet() now acquires the slot spinlock before checking slot->pid ==
pid, ensuring that a slot reused by a new backend after the target exited
cannot be written to. The O(1) procno indexing introduced in v6 is retained.

> I don't think there is any reason to keep the _internal helper functions.

pg_cancel_backend_internal() and pg_terminate_backend_internal() have been
inlined into their respective callers, since each had only a single call
site.

> What if we emit msg as an errdetail?

Done. Both paths now use errdetail("%s", msg) instead of appending the
message to errmsg().

> Also, parallel workers receive the generic message as they hit the

> IsBackgroundWorker branch in ProcessInterrupts(), which bypasses

> BackendMsgGet(). This is fine since parallel workers are ephemeral,

> but should it be documented in the docs?

The documentation now notes that parallel workers always receive the
generic termination message.

> The v6 patch looks like an incremental patch on top of a previous

> patch (v5 I think). But v5, v4, and v3 no longer apply. Would you mind

> rebasing and putting all the changes into one patch?

Done. v7 is a single self-contained patch rebased on the current master
(b574fec00f2).

Testing:

All 7 subtests in src/test/modules/test_misc/t/014_backend_msg.pl pass on
Linux (Ubuntu 24.04, arm64):

t/014_backend_msg.pl .. ok

Result: PASS

Patch attached.

Best Regards,

Warda Bibi

On Tue, Jun 23, 2026 at 11:17 PM Paul A Jungwirth <
pj(at)illuminatedcomputing(dot)com> wrote:

> We talked about this patch for the Patch Review Workshop, and I
> thought I would share a few thoughts.
>
> On Sun, Apr 5, 2026 at 12:27 PM warda Bibi <wardabibi221(at)gmail(dot)com> wrote:
> >
> > We agree with Jim's point of view that the v3 approach is better. v5
> > introduced separate _msg overloads (OID 8223 for
> > pg_cancel_backend_msg, OID 8222 for pg_terminate_backend_msg), which
> > adds catalog bloat and forces callers to use a different function
> > name:
> >
> > > +{ oid => '8223', descr => 'cancel a server process\' current query',
> > > proname => 'pg_cancel_backend', provolatile => 'v', prorettype =>
> 'bool',
> > > proargtypes => 'int4 text', prosrc => 'pg_cancel_backend_msg' },
> > > +{ oid => '8222', descr => 'terminate a server process',
> > > proname => 'pg_terminate_backend', provolatile => 'v', prorettype =>
> 'bool',
> > > proargtypes => 'int4 int8 text', proargnames =>
> '{pid,timeout,message}',
> > > prosrc => 'pg_terminate_backend_msg' },
> >
> > V3 keeps one OID per function and stays backward-compatible. v6
> > restores the v3 approach.
>
> I agree about adding defaults to the existing functions instead of
> introducing new ones. This doesn't need to be an extension, so there
> is no need to have separate functions.
>
> > --- a/src/backend/catalog/system_functions.sql
> > +++ b/src/backend/catalog/system_functions.sql
> > @@ -378,6 +378,16 @@ BEGIN ATOMIC
> > END;
> >
> > +CREATE OR REPLACE FUNCTION
> > + pg_cancel_backend(pid integer, message text DEFAULT '')
> > + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS
> 'pg_cancel_backend'
> > + PARALLEL SAFE;
> > +
> > +CREATE OR REPLACE FUNCTION
> > + pg_terminate_backend(pid integer, timeout int8 DEFAULT 0, message
> > text DEFAULT '')
> > + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS
> 'pg_terminate_backend'
> > + PARALLEL SAFE;
> > +
>
> Are these entries still needed? After 759b03b24c we can set defaults
> with just pg_proc.dat.
>
> > The current approach scans all MaxBackends slots by PID twice. Since
> > pg_signal_backend() already calls BackendPidGetProc(pid) and has the
> > PGPROC* in hand, the PGPROC index into allProcs[ ] is the ProcNumber,
> > which is also the direct index into BackendMsgSlots[]. So v6 changes
> > the signature to BackendMsgSet(ProcNumber procno, ...) and the call in
> > pg_signal_backend() passes GetNumberFromPGProc(proc), making the
> > lookup O(1).
>
> This seems like a nice improvement. But is there a race condition?
> What keeps the backend from being killed (or just finishing the
> connection) and the slot being re-used? v5 was checking the pid while
> holding the spinlock; perhaps we should still do that.
>
> > The _internal helper functions (pg_cancel_backend_internal,
> > pg_terminate_backend_internal) each have only one call site. What is
> > the intent behind keeping them separate, or can they be inlined
> > directly into pg_cancel_backend() and pg_terminate_backend()?
>
> Since we aren't implementing this as an extension, I don't think there
> is any reason to keep the _internal helper functions.
>
> > Andrey noted that:
> >
> > > We have a race condition if many backends cancel same backend.
> > > Won't they mess each other's reason?
> >
> > The spinlock in BackendMsgSet protects the write, but there is no
> > atomicity between writing the slot and delivering the signal. Another
> > backend can overwrite the slot between one writer's write and the
> > target reading it. Would it be valuable to define the behavior
> > explicitly, for example last-writer-wins, or document the limitation?
>
> I think last-writer-wins is okay for canceling the same backend twice,
> but we should make sure the slot is still owned by the right pid.
>
> > Andrey also raised this about translation:
> >
> > > Keep in mind that Postgres literals are translated into many languages.
> > > So text ought to be clear enough for translators to build a sentence
> > > that precedes termination reason.
> >
> > The current pattern:
> >
> > > errmsg("terminating connection due to administrator command: %s",
> msg)
> >
> > The colon-append pattern breaks in languages where the reason clause
> > is structured differently. Any suggestions on how to approach this, or
> > how similar cases are handled elsewhere in the codebase?
>
> What if we emit msg as an errdetail? That seems to resolve the
> translation difficulties, and it makes the logs more structured.
>
> > Also, parallel workers receive the generic message as they hit the
> > IsBackgroundWorker branch in ProcessInterrupts(), which bypasses
> > BackendMsgGet(). This is fine since parallel workers are ephemeral,
> > but should it be documented in the docs?
>
> That seems okay. Agreed about documenting it.
>
> The v6 patch looks like an incremental patch on top of a previous
> patch (v5 I think). But v5, v4, and v3 no longer apply. Would you mind
> rebasing and putting all the changes into one patch? I think your
> suggestions are extensive enough that a single file would be easier to
> review.
>
> Yours,
>
> --
> Paul ~{:-)
> pj(at)illuminatedcomputing(dot)com
>

Attachment Content-Type Size
v7-0001-message-in-pg_terminate_backend-and-pg_cancel_backend.patch application/octet-stream 20.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2026-06-29 10:32:16 Re: Support EXCEPT for ALL SEQUENCES publications
Previous Message Yilin Zhang 2026-06-29 10:13:02 Re:Re: COPY FROM with RLS