| From: | warda Bibi <wardabibi221(at)gmail(dot)com> |
|---|---|
| To: | Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com> |
| Cc: | Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, Roman Khapov <rkhapov(at)yandex-team(dot)ru>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Additional message in pg_terminate_backend |
| Date: | 2026-06-29 10:21:23 |
| Message-ID: | CAJqHjGriG=FiwWQ4M2fOVzUwwG0Rg87fhSQgf68heNhVh=SBgA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Paul, Roman, and all,
Thank you for the thorough review, Paul. I have put together v7 addressing
all of the feedback below.
On Mon, Jun 23, 2026, Paul A Jungwirth wrote:
> Are these entries still needed? After 759b03b24c we can set defaults with
just pg_proc.dat.
Correct. v7 removes the system_functions.sql entries entirely; the DEFAULT
values are now specified only in pg_proc.dat, which is sufficient after
759b03b24c.
> What keeps the backend from being killed and the slot being re-used? v5
was checking the pid while holding the spinlock; perhaps we should still do
that.
BackendMsgSet() now acquires the slot spinlock before checking slot->pid ==
pid, ensuring that a slot reused by a new backend after the target exited
cannot be written to. The O(1) procno indexing introduced in v6 is retained.
> I don't think there is any reason to keep the _internal helper functions.
pg_cancel_backend_internal() and pg_terminate_backend_internal() have been
inlined into their respective callers, since each had only a single call
site.
> What if we emit msg as an errdetail?
Done. Both paths now use errdetail("%s", msg) instead of appending the
message to errmsg().
> Also, parallel workers receive the generic message as they hit the
> IsBackgroundWorker branch in ProcessInterrupts(), which bypasses
> BackendMsgGet(). This is fine since parallel workers are ephemeral,
> but should it be documented in the docs?
The documentation now notes that parallel workers always receive the
generic termination message.
> The v6 patch looks like an incremental patch on top of a previous
> patch (v5 I think). But v5, v4, and v3 no longer apply. Would you mind
> rebasing and putting all the changes into one patch?
Done. v7 is a single self-contained patch rebased on the current master
(b574fec00f2).
Testing:
All 7 subtests in src/test/modules/test_misc/t/014_backend_msg.pl pass on
Linux (Ubuntu 24.04, arm64):
t/014_backend_msg.pl .. ok
Result: PASS
Patch attached.
Best Regards,
Warda Bibi
On Tue, Jun 23, 2026 at 11:17 PM Paul A Jungwirth <
pj(at)illuminatedcomputing(dot)com> wrote:
> We talked about this patch for the Patch Review Workshop, and I
> thought I would share a few thoughts.
>
> On Sun, Apr 5, 2026 at 12:27 PM warda Bibi <wardabibi221(at)gmail(dot)com> wrote:
> >
> > We agree with Jim's point of view that the v3 approach is better. v5
> > introduced separate _msg overloads (OID 8223 for
> > pg_cancel_backend_msg, OID 8222 for pg_terminate_backend_msg), which
> > adds catalog bloat and forces callers to use a different function
> > name:
> >
> > > +{ oid => '8223', descr => 'cancel a server process\' current query',
> > > proname => 'pg_cancel_backend', provolatile => 'v', prorettype =>
> 'bool',
> > > proargtypes => 'int4 text', prosrc => 'pg_cancel_backend_msg' },
> > > +{ oid => '8222', descr => 'terminate a server process',
> > > proname => 'pg_terminate_backend', provolatile => 'v', prorettype =>
> 'bool',
> > > proargtypes => 'int4 int8 text', proargnames =>
> '{pid,timeout,message}',
> > > prosrc => 'pg_terminate_backend_msg' },
> >
> > V3 keeps one OID per function and stays backward-compatible. v6
> > restores the v3 approach.
>
> I agree about adding defaults to the existing functions instead of
> introducing new ones. This doesn't need to be an extension, so there
> is no need to have separate functions.
>
> > --- a/src/backend/catalog/system_functions.sql
> > +++ b/src/backend/catalog/system_functions.sql
> > @@ -378,6 +378,16 @@ BEGIN ATOMIC
> > END;
> >
> > +CREATE OR REPLACE FUNCTION
> > + pg_cancel_backend(pid integer, message text DEFAULT '')
> > + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS
> 'pg_cancel_backend'
> > + PARALLEL SAFE;
> > +
> > +CREATE OR REPLACE FUNCTION
> > + pg_terminate_backend(pid integer, timeout int8 DEFAULT 0, message
> > text DEFAULT '')
> > + RETURNS boolean STRICT VOLATILE LANGUAGE INTERNAL AS
> 'pg_terminate_backend'
> > + PARALLEL SAFE;
> > +
>
> Are these entries still needed? After 759b03b24c we can set defaults
> with just pg_proc.dat.
>
> > The current approach scans all MaxBackends slots by PID twice. Since
> > pg_signal_backend() already calls BackendPidGetProc(pid) and has the
> > PGPROC* in hand, the PGPROC index into allProcs[ ] is the ProcNumber,
> > which is also the direct index into BackendMsgSlots[]. So v6 changes
> > the signature to BackendMsgSet(ProcNumber procno, ...) and the call in
> > pg_signal_backend() passes GetNumberFromPGProc(proc), making the
> > lookup O(1).
>
> This seems like a nice improvement. But is there a race condition?
> What keeps the backend from being killed (or just finishing the
> connection) and the slot being re-used? v5 was checking the pid while
> holding the spinlock; perhaps we should still do that.
>
> > The _internal helper functions (pg_cancel_backend_internal,
> > pg_terminate_backend_internal) each have only one call site. What is
> > the intent behind keeping them separate, or can they be inlined
> > directly into pg_cancel_backend() and pg_terminate_backend()?
>
> Since we aren't implementing this as an extension, I don't think there
> is any reason to keep the _internal helper functions.
>
> > Andrey noted that:
> >
> > > We have a race condition if many backends cancel same backend.
> > > Won't they mess each other's reason?
> >
> > The spinlock in BackendMsgSet protects the write, but there is no
> > atomicity between writing the slot and delivering the signal. Another
> > backend can overwrite the slot between one writer's write and the
> > target reading it. Would it be valuable to define the behavior
> > explicitly, for example last-writer-wins, or document the limitation?
>
> I think last-writer-wins is okay for canceling the same backend twice,
> but we should make sure the slot is still owned by the right pid.
>
> > Andrey also raised this about translation:
> >
> > > Keep in mind that Postgres literals are translated into many languages.
> > > So text ought to be clear enough for translators to build a sentence
> > > that precedes termination reason.
> >
> > The current pattern:
> >
> > > errmsg("terminating connection due to administrator command: %s",
> msg)
> >
> > The colon-append pattern breaks in languages where the reason clause
> > is structured differently. Any suggestions on how to approach this, or
> > how similar cases are handled elsewhere in the codebase?
>
> What if we emit msg as an errdetail? That seems to resolve the
> translation difficulties, and it makes the logs more structured.
>
> > Also, parallel workers receive the generic message as they hit the
> > IsBackgroundWorker branch in ProcessInterrupts(), which bypasses
> > BackendMsgGet(). This is fine since parallel workers are ephemeral,
> > but should it be documented in the docs?
>
> That seems okay. Agreed about documenting it.
>
> The v6 patch looks like an incremental patch on top of a previous
> patch (v5 I think). But v5, v4, and v3 no longer apply. Would you mind
> rebasing and putting all the changes into one patch? I think your
> suggestions are extensive enough that a single file would be easier to
> review.
>
> Yours,
>
> --
> Paul ~{:-)
> pj(at)illuminatedcomputing(dot)com
>
| Attachment | Content-Type | Size |
|---|---|---|
| v7-0001-message-in-pg_terminate_backend-and-pg_cancel_backend.patch | application/octet-stream | 20.5 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2026-06-29 10:32:16 | Re: Support EXCEPT for ALL SEQUENCES publications |
| Previous Message | Yilin Zhang | 2026-06-29 10:13:02 | Re:Re: COPY FROM with RLS |