Re: Show WAL write and fsync stats in pg_stat_io

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Show WAL write and fsync stats in pg_stat_io
Date: 2024-05-13 14:12:11
Message-ID: CALj2ACV6wM8KtqcU+6mZP=vTMSPskVbqXUwM43QyQmaNUWAq6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 19, 2024 at 1:32 PM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
>
> > I wanted to inform you that the 73f0a13266 commit changed all WALRead
> > calls to read variable bytes, only the WAL receiver was reading
> > variable bytes before.
>
> I want to start working on this again if possible. I will try to
> summarize the current status:

Thanks for working on this.

> * With the 73f0a13266 commit, the WALRead() function started to read
> variable bytes in every case. Before, only the WAL receiver was
> reading variable bytes.
>
> * With the 91f2cae7a4 commit, WALReadFromBuffers() is merged. We were
> discussing what we have to do when this is merged. It is decided that
> WALReadFromBuffers() does not call pgstat_report_wait_start() because
> this function does not perform any IO [1]. We may follow the same
> logic by not including these to pg_stat_io?

Right. WALReadFromBuffers doesn't do any I/O.

Whoever reads WAL from disk (backends, walsenders, recovery process)
using pg_pread (XLogPageRead, WALRead) needs to be tracked in
pg_stat_io or some other view. If it were to be in pg_stat_io,
although we may not be able to distinguish WAL read stats at a backend
level (like how many times/bytes a walsender or recovery process or a
backend read WAL from disk), but it can help understand overall impact
of WAL read I/O at a cluster level. With this approach, the WAL I/O
stats are divided up - WAL read I/O and write I/O stats are in
pg_stat_io and pg_stat_wal respectively.

This makes me think if we need to add WAL read I/O stats also to
pg_stat_wal. Then, we can also add WALReadFromBuffers stats
hits/misses there. With this approach, pg_stat_wal can be a one-stop
view for all the WAL related stats. If needed, we can join info from
pg_stat_wal to pg_stat_io in system_views.sql so that the I/O stats
are emitted to the end-user via pg_stat_io.

> * With the b5a9b18cd0 commit, streaming I/O is merged but AFAIK this
> does not block anything related to putting WAL stats in pg_stat_io.
>
> If I am not missing any new changes, the only problem is reading
> variable bytes now. We have discussed a couple of solutions:
>
> 1- Change op_bytes to something like -1, 0, 1, NULL etc. and document
> that this means some variable byte I/O is happening.
>
> I kind of dislike this solution because if the *only* read I/O is
> happening in variable bytes, it will look like write and extend I/Os
> are happening in variable bytes as well. As a solution, it could be
> documented that only read I/Os could happen in variable bytes for now.

Yes, read I/O for relation and WAL can happen in variable bytes. I
think this idea seems reasonable and simple yet useful to know the
cluster-wide read I/O.

However, another point here is how the total number of bytes read is
represented with existing pg_stat_io columns 'reads' and 'op_bytes'.
It is known now with 'reads' * 'op_bytes', but with variable bytes,
how is read bytes calculated? Maybe add new columns
read_bytes/write_bytes?

> 2- Use op_bytes_[read | write | extend] columns instead of one
> op_bytes column, also use the first solution.
>
> This can solve the first solution's weakness but it introduces two
> more columns. This is more future proof compared to the first solution
> if there is a chance that some variable I/O could happen in write and
> extend calls as well in the future.

-1 as more columns impact the readability and usability.

> 3- Create a new pg_stat_io_wal view to put WAL I/Os here instead of pg_stat_io.
>
> pg_stat_io could remain untouchable and we will have flexibility to
> edit this new view as much as we want. But the original aim of the
> pg_stat_io is evaluating all I/O from a single view and adding a new
> view breaks this aim.

-1 as it defeats the very purpose of one-stop view pg_stat_io for all
kinds of I/O. PS: see my response above about adding both WAL write
I/O and read I/O stats to pg_stat_wal.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-05-13 14:13:20 Re: WAL_LOG CREATE DATABASE strategy broken for non-standard page layouts
Previous Message Tom Lane 2024-05-13 14:11:34 Re: Allowing additional commas between columns, and at the end of the SELECT clause