Re: pg_stat_io_histogram

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: Ants Aasma <ants(dot)aasma(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_stat_io_histogram
Date: 2026-02-26 16:13:41
Message-ID: rwnnc3hani7553k5ahyw3gxx2db2cntqn3dpzs7qv7nf6ehmep@lzkxod2s7wk5
Lists: pgsql-hackers

Hi,

On 2026-02-23 13:30:44 +0100, Jakub Wartak wrote:
> > > but I think having it in PgStat_BktypeIO is not great. This makes
> > > PgStat_IO 30k*BACKEND_NUM_TYPES bigger, or ~ 0.5MB. Having a stats snapshot
> > > be half a megabyte bigger for no reason seems too wasteful.
> >
> > Yea, that's not awesome.
>
> Guys, question, could You please explain me what are the drawbacks of having
> this semi-big (internal-only) stat snapshot of 0.5MB? I'm struggling to
> understand two things:
> a) 0.5MB is not a lot those days (ok my 286 had 1MB in the day ;))

I don't really agree with that, I guess. And even if I did, it's one thing to
use 0.5MB when you actually use it, it's quite another when most of that
memory is never used.

With the patch, *every* backend ends up with a substantially larger
pgStatLocal. Before:

nm -t d --size-sort -r -S src/backend/postgres|head -n20|less
(the second column is the decimal size, third the type of the symbol)

0000000004131808 0000000000297456 r yy_transition
...
0000000003916352 0000000000054744 r UnicodeDecompMain
0000000021004896 0000000000052824 B pgStatLocal
0000000003850592 0000000000040416 r unicode_categories
...

after:
0000000023220512 0000000000329304 B pgStatLocal
0000000018531648 0000000000297456 r yy_transition
...
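
For comparing the two builds, the nm lines above can be parsed mechanically. What follows is only a minimal sketch in Python, assuming the four-column layout that `nm -t d -S` prints (address, size, symbol type, name). The helper name `parse_nm_line` is made up for illustration; the two sample lines are copied from the listings above.

```python
from typing import NamedTuple


class Symbol(NamedTuple):
    address: int  # decimal address (because of -t d)
    size: int     # decimal size in bytes (because of -S)
    kind: str     # nm symbol type, e.g. "B"/"b" for zero-initialized data
    name: str


def parse_nm_line(line: str) -> Symbol:
    """Split one 'address size type name' line as printed by nm -t d -S."""
    address, size, kind, name = line.split()
    return Symbol(int(address), int(size), kind, name)


# Sample lines copied from the before/after listings in this mail.
BEFORE = parse_nm_line("0000000021004896 0000000000052824 B pgStatLocal")
AFTER = parse_nm_line("0000000023220512 0000000000329304 B pgStatLocal")

growth = AFTER.size - BEFORE.size
print(f"{AFTER.name}: {BEFORE.size} -> {AFTER.size} bytes (+{growth})")
```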

And because pgStatLocal is zero-initialized data, it'll be on-demand-allocated
in every single backend (whereas e.g. yy_transition is read-only and shared).
So you're not talking about a one-time increase, you're multiplying it by the
number of active connections.
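
To put rough numbers on the multiplication: the sketch below takes the two pgStatLocal sizes from the nm output above and scales the difference by a connection count. The connection counts are hypothetical, and the result is only an upper bound, since the memory is allocated on demand and a backend pays only for the pages it actually touches.

```python
# Sizes in bytes, taken from the nm output above.
PGSTATLOCAL_BEFORE = 52_824
PGSTATLOCAL_AFTER = 329_304

# Growth of the per-backend zero-initialized symbol.
PER_BACKEND_GROWTH = PGSTATLOCAL_AFTER - PGSTATLOCAL_BEFORE


def growth_upper_bound(n_backends: int) -> int:
    """Extra private bytes if every backend touched all of pgStatLocal."""
    return PER_BACKEND_GROWTH * n_backends


print(f"per backend: {PER_BACKEND_GROWTH} bytes ({PER_BACKEND_GROWTH / 1024:.0f} KiB)")
for n in (100, 1000):  # hypothetical numbers of active connections
    print(f"{n} backends: up to {growth_upper_bound(n) / 2**20:.1f} MiB extra")
```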

Now, it's true that most backends won't ever touch pgStatLocal. However, most
backends will touch Pending[Backend]IOStats, which also increased noticeably:

before:
0000000021060960 0000000000002880 b PendingIOStats
0000000021057792 0000000000002880 b PendingBackendStats

after:
0000000023568416 0000000000018240 b PendingIOStats
0000000023549888 0000000000018240 b PendingBackendStats

Again, I think some increase here doesn't have to be fatal, but growing it
with mostly impossible-to-use memory seems like just too much waste to me.
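
The same arithmetic for the two pending-stats symbols, again as a sketch using the sizes from the nm output above. The 4096-byte page size is an assumption (typical, but platform dependent), and the page counts ignore alignment, so they are a minimum for the number of pages each struct spans.

```python
import math

# Sizes in bytes, taken from the nm output above.
BEFORE = {"PendingIOStats": 2_880, "PendingBackendStats": 2_880}
AFTER = {"PendingIOStats": 18_240, "PendingBackendStats": 18_240}

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def min_pages(size: int) -> int:
    """Minimum number of pages a struct of this size spans (ignores alignment)."""
    return math.ceil(size / PAGE_SIZE)


total_growth = sum(AFTER[name] - BEFORE[name] for name in BEFORE)
print(f"combined growth per backend: {total_growth} bytes")

for name in BEFORE:
    print(f"{name}: {BEFORE[name]} -> {AFTER[name]} bytes, "
          f"at least {min_pages(BEFORE[name])} -> {min_pages(AFTER[name])} pages")
```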

This also increases the shared-memory usage of pgstats: Before it used ~300kB
on a small system. That nearly doubles with this patch. But that's perhaps
less concerning, given it's per-system, rather than per-backend memory usage.

> b) how does it affect anything, because testing show it's not?

Which of your tests would conceivably show the effect? The concern here
isn't really performance, it's that it increases our memory usage, which you'd
only see having an effect if you are tight on memory or have a workload that
is cache sensitive.

> My understanding is that it only affects file size on startup/shutdown
> in $PGDATADIR/pgstat/pgstat.stat, correct? My worry is that we introduce
> more code (and bugs) for no real gain (?)

That part is kind of irrelevant compared to the actual increase in memory
usage IMO.

Greetings,

Andres Freund
