Re: Fix pg_stat_get_backend_wait_event() for aux processes

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix pg_stat_get_backend_wait_event() for aux processes
Date: 2026-02-05 20:43:38
Message-ID: CAA5RZ0sPYBVqiU3Fxsh5Y8P3r59zikXKW5i16hcruf-utaLQPw@mail.gmail.com
Lists: pgsql-hackers

> > > There is also a discussion [0] about wait event/activity field inconsistency
> > > with pg_stat_activity, with a repro in [1].
> >
> >
> > The repro I was referring to in [1] is actually
> > https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
>
> That is inherent. The wait event is updated in an unsynchronized fashion. As
> noted in that thread.
>
> Making it synchronized (via st_changecount) would make wait event overhead
> vastly higher.

Correct, and I don't think we should pay the overhead of ensuring
strict synchronization.

The patch that Heikki is proposing in [0] updates the wait event info while
bypassing st_changecount.

```
+++ b/src/include/utils/backend_status.h
@@ -174,6 +174,15 @@ typedef struct PgBackendStatus
 
     /* plan identifier, optionally computed using planner_hook */
     int64   st_plan_id;
+
+   /*
+    * Proc's wait information. This is *not* protected by the changecount
+    * mechanism, because reading and writing a uint32 is assumed to be
+    * atomic. This is updated very frequently, so we want to keep the
+    * overhead as small as possible.
+    */
+   uint32  st_wait_event_info;
+
```

This is the same assumption we are already making:

```
static inline void
pgstat_report_wait_start(uint32 wait_event_info)
{
    /*
     * Since this is a four-byte field which is always read and written as
     * four-bytes, updates are atomic.
     */
    *(volatile uint32 *) my_wait_event_info = wait_event_info;
}
```
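
For comparison, here is roughly what a changecount-protected update would look
like if we went that route (a sketch only: the function name is made up, and
st_wait_event_info is the field proposed in [0]):

```
/*
 * Sketch only: a changecount-protected variant of pgstat_report_wait_start(),
 * writing the st_wait_event_info field proposed in [0].
 */
static inline void
pgstat_report_wait_start_synced(uint32 wait_event_info)
{
    volatile PgBackendStatus *beentry = MyBEEntry;

    if (!beentry)
        return;

    /* critical section + changecount bump + write barrier */
    PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
    beentry->st_wait_event_info = wait_event_info;
    /* write barrier + changecount bump + end of critical section */
    PGSTAT_END_WRITE_ACTIVITY(beentry);
}
```

pgstat_report_wait_end() would need the same treatment, so every wait would pay
that cost twice.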

So without st_changecount, the wait event can still be out of sync with the
other backend status fields (e.g. state), but my testing shows better results
with [0] applied. "Better" here means no samples with state = active,
wait_event = ClientRead and backend_type = client backend.
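
To make the remaining skew concrete, here is a simplified sketch of the kind of
read the monitoring side does (loosely modeled on pgstat_read_current_status(),
using the st_wait_event_info field from [0]; not the actual code): the
fixed-size part of the entry is copied under the changecount protocol, while
the wait event is a plain 4-byte load that is not tied to that retry loop, so
it can reflect a slightly different instant than the copied state.

```
/* Sketch only: sample one backend's status plus its wait event. */
static uint32
sample_backend_status(volatile PgBackendStatus *beentry, PgBackendStatus *snapshot)
{
    for (;;)
    {
        int     before_changecount;
        int     after_changecount;

        /* copy the fixed-size part of the entry under the changecount protocol */
        pgstat_begin_read_activity(beentry, before_changecount);
        memcpy(snapshot, unvolatize(PgBackendStatus *, beentry),
               sizeof(PgBackendStatus));
        pgstat_end_read_activity(beentry, after_changecount);

        if (pgstat_read_activity_complete(before_changecount, after_changecount))
            break;

        CHECK_FOR_INTERRUPTS();
    }

    /* a plain load, not tied to the snapshot above, so it can be "newer" */
    return *(volatile uint32 *) &beentry->st_wait_event_info;
}
```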

Using the attached repro.sh:

## without patch
```
# of samples with state = active, wait_event = ClientRead and backend_type = client backend
238

# of samples with state = active, by wait_event
11774 transactionid
4054 tuple
489 CPU
444 LockManager
343 WALWrite
238 ClientRead
69 WalSync
15 BufferExclusive
2 VacuumDelay
2 BufferShared
1 DataFileWrite
```

## with patch [0]

```
# of samples with state = active, wait_event = ClientRead and backend_type = client backend
0

# of samples with state = active, by wait_event
12516 transactionid
3188 tuple
504 WALWrite
414 CPU
411 LockManager
86 WalSync
7 BufferExclusive
4 BufferShared
2 VacuumDelay
```

Also, it's worth noting that we already bypass st_changecount for the VACUUM
and ANALYZE delay times. This was discussed in [2].

vacuumlazy.c:

```
if (track_cost_delay_timing)
{
    /*
     * We bypass the changecount mechanism because this value is
     * only updated by the calling process.
     */
    appendStringInfo(&buf, _("delay time: %.3f ms\n"),
                     (double) MyBEEntry->st_progress_param[PROGRESS_VACUUM_DELAY_TIME] / 1000000.0);
}
```

commands/analyze.c:

```
if (track_cost_delay_timing)
{
    /*
     * We bypass the changecount mechanism because this value is
     * only updated by the calling process. We also rely on the
     * above call to pgstat_progress_end_command() to not clear
     * the st_progress_param array.
     */
    appendStringInfo(&buf, _("delay time: %.3f ms\n"),
                     (double) MyBEEntry->st_progress_param[PROGRESS_ANALYZE_DELAY_TIME] / 1000000.0);
}
```

[0] https://www.postgresql.org/message-id/459e78c0-927f-4347-86df-ca431567c95a%40iki.fi
[1] https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
[2] https://www.postgresql.org/message-id/Z6-tiXbLwHYyDeNy%40nathan

--
Sami Imseih
Amazon Web Services (AWS)

Attachment: repro.sh (application/x-sh, 781 bytes)
