| From: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: 64-bit wait_event and introduction of 32-bit wait_event_arg |
| Date: | 2026-02-16 11:34:17 |
| Message-ID: | CAKZiRmzUZS2G5SeZBnOXrJuFNnj9-sfY3o+nKHSP92Cq8uEiHA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Feb 13, 2026 at 5:46 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Thu, Feb 12, 2026 at 01:42:23PM +0100, Jakub Wartak wrote:
Hi Michael, thanks for taking a look at this!
> > What's left:
> > - Earlier Heikki raised the question "Wait events can be defined in extensions;
> > how does an extension plug into this facility?" - that's still unanswered.
>
> Reserving the full 8 bytes to the callers of WaitEventExtensionNew()
> and WaitEventInjectionPointNew() would be an error, because we would
> forever lock down the possibility for extensions to set at will the 4
> extra bytes that become available when setting some extra data in
> parallel of a wait event name. The results of these routines should
> still be 4 bytes for the "static" part of the wait event names, not 8.
When trying to understand this I found that the patch contained assumption
error on my side that might have mislead You: I was blind to the fact that
all custom wait events land in PG_WAIT_EXTENSION class, so we need in
WaitEventCustomNew() to shift the resulting eventId from nextId++ to the
proper place in 64-bit word (this leaves the 32-bits part (for
wait_event_arg) initially as zero when returned by those routines)
.. and then, well, the registration of the new custom wait event is something
distinct from starting the waiting on it, right? I mean the extension
could:
// it will contain PG_WAIT_EXTENSION | shifted_eventId_from_nextId
// + zero on all 32-bit lower bits (for wait_event_arg)
uint64 custom_wait_event = WaitEventExtensionNew("custom");
libpqsrv_connect(..,custom_wait_event);
..
// and we can still add 32-bit argument there
libpqsrv_connect(..,custom_wait_event | some_wait_event_arg);
Because probably the code was misleading Could I ask You to
re-evaluate the above
concern based on the fixed code?
> > I think they could just OR 32-bit value themselves, but maybe we could
> > just provide a way to plug into pg_get_wait_events().waiteventarg_description?
>
> The value provided back to pg_stat_activity would be a 4-byte integer
> under this design, whose interpretation is up to the client, I guess,
> with a filter based on the wait event name found (likely a CASE/ELSE
> to force casts back to a text value at the end in most cases?). That
> may be annoying for client applications, though, but perhaps
> acceptable as this provides extra information with a single atomic
> write.
Yes, today we just return the lower half of 64-bits as
pg_stat_activity.wait_event_arg::integer. The usage could be be interpreted
by the SQL query doing CASE/ELSE if necessary (or just by human when directly
looking at the pg_get_wait_events().waiteventarg_description -- that is what
is implemented today for the IO/Slru* wait event and their wait_event_arg),
e.g.:
postgres=# select type, substring(name, 1, 20) wait,
waiteventarg_description as desc
from pg_get_wait_events() where type = 'IO' and name like 'Slru%';
type | wait |
desc
------+---------------+----------------------------------------------------------------------------------------------------------------------------------
IO | SlruFlushSync | SlruType: unknown(0), notify(1), clog(2),
subtrans(3), committs(4), multixactoffset (5), multixactmembers(6),
serialializable(7)
IO | SlruRead | SlruType: unknown(0), notify(1), clog(2),
subtrans(3), committs(4), multixactoffset (5), multixactmembers(6),
serialializable(7)
[..]
so one of course could write something like to interpret it:
SELECT *, CASE WHEN wait_event_arg=0 THEN 'unknown'
WHEN wait_event_arg=1 THEN 'notify'
[..]
ELSE 'other'
END as
FROM pg_stat_activity WHERE wait_event LIKE 'Slru%';
> At the end, the way these 4 extra bytes can be set by extensions is an
> API problem for me, and I suspect that the correct way to extend
> things, on top of forcing the use of 4 bytes for the ID of the fixed
> event ID (perhaps just define a type here anyway?), would be to patch
> the most popular APIs that extensions currently use to let them set
> the value they want for the extra 4 bytes. The first choice that
> comes into mind here is the family of WaitLatchOrSocket() APIs, that
> could have an extra argument with a uint32 for the extra data. That's
> a popular one among extension developers.
Right now in the patchset, the WaitLatchOrSocket() and WaitLatch() take
uint64 argument, I've done that so that we I could minimize the patchset
footprint and then e.g. I do still:
- rc = WaitLatch(MyLatch, ..., WAIT_EVENT_SYNC_REP);
+ rc = WaitLatch(MyLatch, ..., WAIT_EVENT_SYNC_REP | wait_event_arg_pid);
(so uint64 that is OR-ed with CLASS and 32-bit lower e.g. pid here)
so without many internal unnecessary changes as WaitLatch() is used heavily.
BTW: note that all of the WAIT_EVENT_* are now 64-bits, so it allows to
work that way.
One problem with that approach is that with WaitLatchOrSocket()/WaitLatch()
the extensions would have to be altered (the extension code can pass uint32
silently - it will compile and won't show a good wait event in runtime).
Sadly this won't be detected without -Wconversion (silent casting will
uint32->uint64) and I don't have other idea how silent casting could
be detected during extension compilation, as even
`#pragma GCC diagnostic warning "-Wconversion"` won't be effective in
3rd party code using this.
> By the way, patch 0001 includes a log file from pg_plan_advice with
> some information I suspect you did not intend to send..
Oops, fixed, thanks.
New version attached, it also fixed the dblink bug where it was not showing
proper wait events (missing uint32 to uint64 change).
Probably one thing remaining is also the API about pg_get_wait_events(). To
me it looks like:
- the extensions do not have way to pass description (it's just static
"Waiting for custom wait event \"%s\" defined by extension module")
- therefore they also miss way to pass waiteventarg_description
I think there are two ways there:
a) either display static text for waiteventarg_description like:
"consult extension documentation for details"
b) or we could add some more extended way to initialize:
WaitEventExtensionNew(
const char *wait_event_name,
const char *wait_event_desc,
const char *wait_event_arg_desc);
but I seriously doubt it would be used by anyone, but maybe I'm wrong.
-J.
| Attachment | Content-Type | Size |
|---|---|---|
| v5-0006-wait_event_arg-expose-buffer-for-Buffer-type-wait.patch | text/x-patch | 3.7 KB |
| v5-0003-wait_event_arg-implement-SLRU-type-reporting-for-.patch | text/x-patch | 11.1 KB |
| v5-0002-wait_event_arg-expose-slowest-standby-PID-for-IPC.patch | text/x-patch | 2.8 KB |
| v5-0005-wait_event_arg-report-number-of-spinlock-delays-f.patch | text/x-patch | 2.1 KB |
| v5-0004-Expose-meaning-of-new-per-wait-wait_event_arg-thr.patch | text/x-patch | 11.3 KB |
| v5-0001-Convert-wait_event_info-to-64-t-bits-expose-lower.patch | text/x-patch | 67.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Reshmithaa | 2026-02-16 11:54:34 | Re: Crash related to Shared Memory |
| Previous Message | lakshmi | 2026-02-16 11:26:13 | Re: Add a greedy join search algorithm to handle large join problems |