| From: | Sami Imseih <samimseih(at)gmail(dot)com> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Fix pg_stat_get_backend_wait_event() for aux processes |
| Date: | 2026-02-05 20:43:38 |
| Message-ID: | CAA5RZ0sPYBVqiU3Fxsh5Y8P3r59zikXKW5i16hcruf-utaLQPw@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> > > There is also a discussion [0] about wait event/activity field
> > > inconsistency
> > > with pg_stat_activity with a repro in [1].
> >
> >
> > The repro I was referring to in [1] is actually
> > https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
>
> That is inherent. The wait event is updated in an unsynchronized fashion. As
> noted in that thread.
>
> Making it synchronized (via st_changecount) would make wait event overhead
> vastly higher.
Correct, and I don't think we should pay the overhead of ensuring
strict synchronization.
The patch that Heikkei is proposing [0] will update wait_event_info bypassing
st_changecount.
```
+++ b/src/include/utils/backend_status.h
@@ -174,6 +174,15 @@ typedef struct PgBackendStatus
/* plan identifier, optionally computed using planner_hook */
int64 st_plan_id;
+
+ /*
+ * Proc's wait information. This *not* protected by the changecount
+ * mechanism, because reading and writing an uint32 is assumed
to atomic.
+ * This is updated very frequently, so we want to keep the overhead as
+ * small as possible.
+ */
+ uint32 st_wait_event_info;
+
```
Which is the same assumption we are already making.
```
static inline void
pgstat_report_wait_start(uint32 wait_event_info)
{
/*
* Since this is a four-byte field which is always read and written as
* four-bytes, updates are atomic.
*/
*(volatile uint32 *) my_wait_event_info = wait_event_info;
}
```
So without st_changecount, wait_events could still be out of sync on with
other backend status fields (i.e. state), but at least my testing shows
better results with [0] applied. Better results here are no samples
with " state = active, wait_event = ClientRead and backend_type =
client backend"
Using the attached repro.sh (attached):
## without patch
```
# of samplase of state = active, wait_event = ClientRead and
backend_type = client backend
238
# of sampled of state = active + wait_event
11774 transactionid
4054 tuple
489 CPU
444 LockManager
343 WALWrite
238 ClientRead
69 WalSync
15 BufferExclusive
2 VacuumDelay
2 BufferShared
1 DataFileWrite
```
## with patch [0]
```
# of samplase of state = active, wait_event = ClientRead and
backend_type = client backend
0
# of sampled of state = active + wait_event
12516 transactionid
3188 tuple
504 WALWrite
414 CPU
411 LockManager
86 WalSync
7 BufferExclusive
4 BufferShared
2 VacuumDelay
```
Also, it's worth noting that we bypass st_changecount for VACUUM +
ANALYZE delay time.
This was discussed in [2]
vacuumlazy.c:
```
if (track_cost_delay_timing)
{
/*
* We bypass the changecount mechanism because this value is
* only updated by the calling process.
*/
appendStringInfo(&buf, _("delay time: %.3f ms\n"),
(double) MyBEEntry->st_progress_param[PROGRESS_ANALYZE_DELAY_TIME] / 1000000.0);
}
```
commands/analyze.c:
```
if (track_cost_delay_timing)
{
/*
* We bypass the changecount mechanism because this value is
* only updated by the calling process. We also rely on the
* above call to pgstat_progress_end_command() to not clear
* the st_progress_param array.
*/
appendStringInfo(&buf, _("delay time: %.3f ms\n"),
(double) MyBEEntry->st_progress_param[PROGRESS_VACUUM_DELAY_TIME] / 1000000.0);
}
```
[0] https://www.postgresql.org/message-id/459e78c0-927f-4347-86df-ca431567c95a%40iki.fi
[1] https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
[2] https://www.postgresql.org/message-id/Z6-tiXbLwHYyDeNy%40nathan
--
Sami Imseih
Amazon Web Services (AWS)
| Attachment | Content-Type | Size |
|---|---|---|
| repro.sh | application/x-sh | 781 bytes |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andres Freund | 2026-02-05 20:52:30 | Unfortunate pushing down of expressions below sort |
| Previous Message | Zsolt Parragi | 2026-02-05 20:35:04 | Re: pg_dumpall --roles-only interact with other options |