Re: Show WAL write and fsync stats in pg_stat_io

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "bharath(dot)rupireddyforpostgres(at)gmail(dot)com" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: Show WAL write and fsync stats in pg_stat_io
Date: 2023-12-25 06:20:58
Message-ID: ZYkfSkorj6OgD6ZM@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Dec 16, 2023 at 08:20:57PM +0100, Michael Paquier wrote:
> One thing that 0001 missed is an update of the header where the
> function is declared. I've edited a few things, and applied it to
> start on this stuff. The rest will have to wait a bit more..

I have been reviewing the whole, and spotted a couple of issues.

+ * At the end of the if case, accumulate time for the pg_stat_io.
+ */
+ if (pgstat_should_track_io_time(io_object, io_context))

There was a bug here. WAL operations can do IOOP_WRITE or IOOP_READ,
and this would cause pgstat_count_buffer_read_time() and
pgstat_count_buffer_write_time() to be called, incrementing
pgStatBlock{Read,Write}Time, which would be incorrect when it comes to
a WAL page or a WAL segment. I was wondering what to do here first,
but we could just avoid calling these routines when working on an
IOOBJECT_WAL as that's the only object not doing a buffer operation.

A comment at the top of pgstat_tracks_io_bktype() is incorrect,
because this patch adds the WAL writer sender in the I/O tracking.

+ case B_WAL_RECEIVER:
+ case B_WAL_SENDER:
+ case B_WAL_WRITER:
+ return false;

pgstat_tracks_io_op() now needs B_WAL_SUMMARIZER.

pgstat_should_track_io_time() is used only in pgstat_io.c, so it can
be static rather than published in pgstat.h.

pgstat_tracks_io_bktype() does not look correct to me. Why is the WAL
receiver considered as something correct in the list of backend types,
while the intention is to *not* add it to pg_stat_io? I have tried to
switche to the correct behavior of returning false for a
B_WAL_RECEIVER, to notice that pg_rewind's test 002_databases.pl
freezes on its shutdown sequence. Something weird is going on here.
Could you look at it? See the XXX comment in the attached, which is
the same behavior as v6-0002. It looks to me that the patch has
introduced an infinite loop tweaking pgstat_tracks_io_bktype() in an
incorrect way to avoid the root issue.

I have also spent more time polishing the rest, touching a few things
while reviewing. Not sure that I see a point in splitting the tests
from the main patch.
--
Michael

Attachment Content-Type Size
v7-0001-Show-WAL-stats-on-pg_stat_io-except-streaming.patch text/x-diff 29.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-12-25 06:40:17 Re: Show WAL write and fsync stats in pg_stat_io
Previous Message Kyotaro Horiguchi 2023-12-25 05:51:24 Should "CRC" be in uppercase?