| From: | 신성준 <shinsj4653(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Cc: | Kirk Wolak <wolakk(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Andreas Karlsson <andreas(at)proxel(dot)se>, Nikolay Samokhvalov <nik(at)postgres(dot)ai> |
| Subject: | Re: Add wait events for server logging destination writes |
| Date: | 2026-05-31 10:42:41 |
| Message-ID: | CACdN0M4ENFV0Hg4zbBkP+ScRHdeN9NdYx9QV79h5BM_eN9YxHA@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi,

cfbot caught a build failure on v1, in the SanityCheck task on Linux
and Windows: elog.c uses pgstat_report_wait_start()/end() and the
WAIT_EVENT_* constants but didn't include utils/wait_event.h. It only
built here because of an accidental transitive include on my machine;
on the CI images the declarations weren't visible.

v2 fixes that by adding the missing #include "utils/wait_event.h" to
elog.c, folded into 0001 so that patch builds on its own. No other
changes; the wait events and the reported write paths are the same as
in v1.

v2-0001 adds the two events and covers the write(2) paths.
v2-0002 covers the Windows WriteConsoleW() path, split out as before.

Applies cleanly on current master; full build passes locally.
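For anyone reading along without the patch open, the wrapping pattern looks roughly like the sketch below. This is not the patch text. It is a standalone approximation in which pgstat_report_wait_start()/end() are replaced by local stand-ins so it compiles outside the tree. The constant name WAIT_EVENT_SYSLOGGER_WRITE, its value, and the helper write_with_wait_event() are assumptions made for illustration.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Stand-ins for the helpers that the real tree declares in
 * utils/wait_event.h.  In PostgreSQL they publish the wait event for
 * pg_stat_activity; here they only record it in local variables.
 */
#define WAIT_EVENT_SYSLOGGER_WRITE 0x0A000001U	/* hypothetical name and value */

static uint32_t current_wait_event = 0;	/* 0 means "not waiting" */
static uint32_t last_wait_event = 0;	/* kept only so the sketch can be checked */

static void
pgstat_report_wait_start(uint32_t wait_event_info)
{
	current_wait_event = wait_event_info;
	last_wait_event = wait_event_info;
}

static void
pgstat_report_wait_end(void)
{
	current_wait_event = 0;
}

/*
 * Hypothetical helper: report the wait event around the leaf write(2)
 * only.  No allocation and no error reporting happens in between, which
 * is what keeps the pattern usable inside the error-reporting path.
 */
static ssize_t
write_with_wait_event(int fd, const void *buf, size_t len, uint32_t wait_event)
{
	ssize_t		rc;

	pgstat_report_wait_start(wait_event);
	rc = write(fd, buf, len);
	pgstat_report_wait_end();

	return rc;
}
```

While the write(2) is blocked, a sampler sees the event set; once it returns, the event is cleared again, which is why the window is kept as narrow as the single system call.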

Thanks,
Seongjun Shin

On Sun, May 31, 2026 at 5:50 PM, 신성준 <shinsj4653(at)gmail(dot)com> wrote:
>
> Hi hackers,
>
> The write(2) calls that flush server log output aren't covered by wait
> events. When a backend logs something, the writes go out in:
>
> - write_pipe_chunks(): write(2) to the syslogger pipe
> - write_console(): write(2) to stderr (WriteConsoleW() on Windows)
>
> If one of those blocks -- syslogger pipe full, slow console, slow log
> device -- pg_stat_activity just shows wait_event = NULL until it
> returns. Since NULL usually reads as "on CPU", a backend stuck writing
> logs looks like it's doing work, so logging-related stalls are easy to
> miss.
>
> Attached is a short series that adds two WaitEventIO events and reports
> them around those writes:
>
>   IO / SysloggerWrite - write(2) to the syslogger pipe
>   IO / StderrWrite    - write(2) to stderr, and WriteConsoleW()
>
> 0001 adds the events and covers the write(2) paths. 0002 does the
> Windows WriteConsoleW() path, split out since it's platform-specific.
>
> It only wraps the leaf write call and uses the existing
> pgstat_report_wait_start()/end() helpers, so it stays allocation-free
> and safe to call from inside the error-reporting path.
>
> I did a quick before/after to make sure the events show up: 8 backends
> each emitting large RAISE LOG lines, sampling wait_event from
> pg_stat_activity every 50 ms for 20 s.
>
> - logging_collector = on (syslogger pipe):
>     master:  NULL 100.0% (2184/2184)
>     patched: IO/SysloggerWrite 99.1% (2204/2224), NULL 0.9%
>
> - logging_collector = off (stderr):
>     master:  NULL 100.0% (2144/2144)
>     patched: IO/StderrWrite 90.7% (1952/2152), NULL 9.3%
>
> On master that wait time is just invisible; with the patch it lands on
> the new events. I can send the scripts and raw samples if anyone wants
> to reproduce it.
>
> Applies on current master. A couple of things I'm unsure about and
> would appreciate input on: whether the event names fit the surrounding
> conventions, and whether splitting the Windows path into its own patch
> is the right call.
>
> Thanks,
> Seongjun Shin
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-Add-wait-events-for-server-logging-destination-wr.patch | application/octet-stream | 3.8 KB |
| v2-0002-Report-StderrWrite-wait-event-around-WriteConsole.patch | application/octet-stream | 1.3 KB |