Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date: 2022-10-26 18:58:08
Message-ID: 20221026185808.4qnxowtn35x43u7u@awork3.anarazel.de

Hi,

On 2022-10-24 14:38:52 -0400, Melanie Plageman wrote:
> > - "repossession" is a very unintuitive name for me. If we want something like
> > it, can't we just name it reuse_failed or such?
>
> Repossession could be called eviction_failed or reuse_failed.
> Do we think we will ever want to use it to count buffers we released
> in other IOContexts (thus making the name eviction_failed better than
> reuse_failed)?

I've a somewhat radical proposal: Let's just not count any of this in the
initial version. I think we want something, but clearly it's one of the harder
aspects of this patch. Let's get the rest in, and then work on this in
isolation.

> Speaking of IOCONTEXT_LOCAL, I was wondering if it is confusing to call
> it IOCONTEXT_LOCAL since it refers to IO done for temporary tables. What
> if, in the future, we want to track other IO done using data in local
> memory?

Fair point. However, I think 'tmp' or 'temp' would be worse, because there are
other sources of temporary files that would be worth counting, e.g. tuplestore
temporary files. 'temptable' isn't good because it's not just tables.
'temprel'? On balance I think local is better, but I'm not sure.

> Also, what if we want to track other IO done using data from shared memory
> that is not in shared buffers? Would IOCONTEXT_SB and IOCONTEXT_TEMP be
> better? Should IOContext literally describe the context of the IO being done
> and there be a separate column which indicates the source of the data for
> the IO? Like wal_buffer, local_buffer, shared_buffer? Then if it is not
> block-oriented, it could be shared_mem, local_mem, or bypass?

Hm. I don't think we'd need _buffer for WAL or such, because there's nothing
else.

> If we had another dimension to the matrix "data_src" which, with
> block-oriented IO is equivalent to "buffer type", this could help with
> some of the clarity problems.
>
> We could remove the "reused" column and that becomes:
>
> IOCONTEXT | DATA_SRC        | IOOP
> ----------------------------------------
> strategy  | strategy_buffer | EVICT

> Having data_src and iocontext simplifies the meaning of all io
> operations involving a strategy. Some operations are done on shared
> buffers and some on existing strategy buffers and this would be more
> clear without the addition of special columns for strategies.

-1, I think this just blows up the complexity further, without providing much
benefit. But:

Perhaps a somewhat similar idea could be used to address the concerns in the
preceding paragraphs. How about the following set of columns:

backend_type:
object: relation, temp_relation[, WAL, tempfiles, ...]
iocontext: buffer_pool, bulkread, bulkwrite, vacuum[, bypass]
read:
written:
extended:
bytes_conversion:
evicted:
reused:
files_synced:
stats_reset:
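
To make the shape of that column set a bit more concrete, here is a minimal
Python sketch. It is purely illustrative and not taken from the patch: the
class and function names are hypothetical, the bracketed optional values
(WAL, tempfiles, bypass) are left out, stats_reset is omitted, and treating
bytes_conversion as "bytes per counted operation" with a value of 8192 is an
assumption.

```python
from collections import defaultdict
from dataclasses import dataclass

# All names below are hypothetical; the value sets mirror the proposed columns.
OBJECTS = ("relation", "temp_relation")
IOCONTEXTS = ("buffer_pool", "bulkread", "bulkwrite", "vacuum")
IOOPS = ("read", "written", "extended", "evicted", "reused", "files_synced")


@dataclass
class IOCounters:
    """Counter columns of one row."""
    read: int = 0
    written: int = 0
    extended: int = 0
    evicted: int = 0
    reused: int = 0
    files_synced: int = 0


class IOStats:
    def __init__(self, bytes_conversion=8192):
        # Assumption: bytes per counted read/write/extend (i.e. the block size).
        self.bytes_conversion = bytes_conversion
        # One row per (backend_type, object, iocontext) combination.
        self.rows = defaultdict(IOCounters)

    def count(self, backend_type, obj, iocontext, op, n=1):
        """Add n to counter `op` of the row identified by the three key columns."""
        if obj not in OBJECTS:
            raise ValueError(f"unknown object: {obj}")
        if iocontext not in IOCONTEXTS:
            raise ValueError(f"unknown iocontext: {iocontext}")
        if op not in IOOPS:
            raise ValueError(f"unknown operation: {op}")
        row = self.rows[(backend_type, obj, iocontext)]
        setattr(row, op, getattr(row, op) + n)

    def bytes_read(self, backend_type, obj, iocontext):
        """Convert the read counter of one row to bytes."""
        row = self.rows.get((backend_type, obj, iocontext), IOCounters())
        return row.read * self.bytes_conversion


stats = IOStats()
stats.count("client backend", "relation", "bulkread", "read", 3)
stats.count("autovacuum worker", "relation", "vacuum", "reused")
print(stats.bytes_read("client backend", "relation", "bulkread"))
```

The point of the sketch is only that "object" and "iocontext" are independent
key columns, so temporary relations need no iocontext of their own.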

Greetings,

Andres Freund
