Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

From: Lukas Fittl <lukas(at)fittl(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date: 2022-09-30 23:17:25
Message-ID: CAP53PkyL5+h0NfKsY-XQ7fq9ha+37G5gd5vhZYtb6QQyJnvHdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 27, 2022 at 11:20 AM Melanie Plageman <melanieplageman(at)gmail(dot)com>
wrote:

> v30 attached
> rebased and pgstat_io_ops.c builds with meson now
> also, I tested with pgstat_report_stat() only flushing when forced and
> tests still pass
>

First of all, I'm excited about this patch, and I think it will be a big
help to understand better which part of Postgres is producing I/O (and why).

I've paired up with Maciek (CCed) on a review of this patch and had a few
comments, focused on the user experience:

The term "strategy" as an "io_context" is hard to understand, as its not a
concept an end-user / DBA would be familiar with. Since this comes from
BufferAccessStrategyType (i.e. anything not NULL/BAS_NORMAL is treated as
"strategy"), maybe we could instead split this out into the individual
strategy types? i.e. making "strategy" three different I/O contexts
instead: "shared_bulkread", "shared_bulkwrite" and "shared_vacuum",
retaining "shared" to mean NULL / BAS_NORMAL.

Separately, could we also track buffer hits without incurring extra
overhead? (not just allocs and reads) -- Whilst we already have shared read
and hit counters in a few other places, this would help make the common
"What's my cache hit ratio" question more accurate to answer in the
presence of different shared buffer access strategies. Tracking hits could
also help for local buffers (e.g. to tune temp_buffers based on seeing a
low cache hit ratio).

Additionally, some minor notes:

- Since the stats are counting blocks, it would make sense to prefix the
view columns with "blks_", and word them in the past tense (to match
current style), i.e. "blks_written", "blks_read", "blks_extended",
"blks_fsynced" (realistically one would combine this new view with other
data e.g. from pg_stat_database or pg_stat_statements, which all use the
"blks_" prefix, and stop using pg_stat_bgwriter for this which does not use
such a prefix)

- "alloc" as a name doesn't seem intuitive (and it may be confused with
memory allocations) - whilst this is already named this way in
pg_stat_bgwriter, it feels like this is an opportunity to eventually
deprecate the column there and make this easier to understand -
specifically, maybe we can clarify that this means buffer *acquisitions*?
(either by renaming the field to "blks_acquired", or clarifying in the
documentation)

- Assuming we think this view could realistically cover all I/O produced by
Postgres in the future (thus warranting the name "pg_stat_io"), it may be
best to have an explicit list of things that are not currently tracked in
the documentation, to reduce user confusion (i.e. WAL writes are not
tracked, temporary files are not tracked, and some forms of direct writes
are not tracked, e.g. when a table moves to a different tablespace)

- In the view documentation, it would be good to explain the different
values for "io_strategy" (and what they mean)

- Overall it would be helpful if we had a dedicated documentation page on
I/O statistics that's linked from the pg_stat_io view description, and
explains how the I/O statistics tie into the various concepts of shared
buffers / buffer access strategies / etc (and what is not tracked today)

Thanks,
Lukas

--
Lukas Fittl

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-09-30 23:18:34 Re: predefined role(s) for VACUUM and ANALYZE
Previous Message Zhihong Yu 2022-09-30 23:11:09 Re: [BUG] parenting a PK constraint to a self-FK one (Was: Self FK oddity when attaching a partition)