Re: pg_stat_io_histogram

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Ants Aasma <ants(dot)aasma(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_stat_io_histogram
Date: 2026-02-23 12:35:38
Message-ID: CAKZiRmypQTu0r99koT-ihajfAb-vn3CNwExoG4jkSznBuMN7CQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 19, 2026 at 7:12 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2026-02-19 19:55:06 +0200, Ants Aasma wrote:
> > > Right now the lowest bucket is for 0-8 ms, the second for 8-16, the third for
> > > 16-32. I.e. the first bucket is the same width as the second. Is that
> > > intentional?
> >
> > If the boundaries are not on power-of-2 calculating the correct bucket
> > would take a bit longer.
>
> Powers of two make sense, my point was that the lowest bucket and the next
> smallest one are *not* sized in a powers of two fashion, unless I miss
> something?

Yes, as stated earlier, it's intentionally made flat at the beginning to be able
to differentiate those fast accesses.
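To make the bucket layout discussed above concrete, here is a minimal sketch of a power-of-2 bucket lookup with a flat first bucket. It is not the code from the patch: the names `hist_bucket_index`, `HIST_NBUCKETS` and `HIST_MIN_SHIFT` are hypothetical, the bucket count of 16 is an arbitrary assumption, and it uses the GCC/Clang builtin `__builtin_clzll` directly rather than any PostgreSQL helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameters, chosen only for illustration. */
#define HIST_NBUCKETS   16      /* assumed number of buckets */
#define HIST_MIN_SHIFT  3       /* first bucket covers [0, 2^3) = [0, 8) */

/*
 * Map a latency value to a bucket index.
 * Bucket 0 is [0, 8), bucket 1 is [8, 16), bucket 2 is [16, 32), ...
 * The last bucket collects everything that would land beyond it.
 */
static inline int
hist_bucket_index(uint64_t value)
{
    int msb;
    int idx;

    /* Flat first bucket: everything below 2^HIST_MIN_SHIFT goes here. */
    if (value < (UINT64_C(1) << HIST_MIN_SHIFT))
        return 0;

    /* Position of the most significant set bit, found via CLZ. */
    msb = 63 - __builtin_clzll(value);
    assert(msb >= HIST_MIN_SHIFT);

    idx = msb - HIST_MIN_SHIFT + 1;

    /* Clamp into the last bucket. */
    return idx < HIST_NBUCKETS ? idx : HIST_NBUCKETS - 1;
}
```

With these assumed parameters the first two buckets are both 8 units wide ([0, 8) and [8, 16)), which is the flat start mentioned above; from the second bucket on, each bucket is twice as wide as the previous one.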

> > For reducing the number of buckets one option is to use log base-4 buckets
> > instead of base-2.
>
> Yea, that could make sense, although it'd be somewhat sad to lose that much
> precision.

Same here, as stated earlier I wouldn't like to lose this precision.

> > But if we are worried about the size, then reducing the number of histograms
> > kept would be better.
>
> I think we may want both.

+1.

> > Many of the combinations are not used at all

This!

> Yea, and for many of the operations we will never measure time and thus will
> never have anything to fill the histogram with.
>
> Perhaps we need to do something like have an array of histogram IDs and then a
> smaller number of histograms without the same indexing. That implies more
> indirection, but I think that may be acceptable - the overhead of reading a
> page are high enough that it's probably fine, whereas a lot more indirection
> for something like a buffer hit is a different story.

OK, so the options raised so far in the thread are:
a) we might use uint32 instead of uint64 and deal with overflows
b) we might filter some of the combinations out in order to save some memory.
The trouble would be which ones to eliminate... and would e.g. a 2x saving be
enough?
c) we leave it as it is (accept the simple/optimal code and waste this ~0.5MB
in pgstat.stat)
d) the above (the indirection proposal quoted above) - but I hardly understand
what it would look like at all
e) eliminate some precision (via log4?) or a column (like context/) - IMHO we
would lose too much precision or the original goals with this.
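For option (d), one possible reading of the indirection idea is sketched below. This is only my interpretation, not something from the patch: the struct `IoHistograms`, the functions `io_hist_init`, `io_hist_enable` and `io_hist_count`, and all dimension sizes are hypothetical. A small per-combination array holds a histogram ID, and only the combinations that are actually timed get a real histogram.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical dimension sizes, for illustration only. */
#define N_OBJECTS   3
#define N_CONTEXTS  5
#define N_OPS       8
#define N_BUCKETS   16
#define N_HISTS     4           /* histograms actually allocated */
#define HIST_NONE   UINT8_MAX   /* marker: combination is not tracked */

typedef struct IoHistograms
{
    /* One byte per (object, context, op) combination. */
    uint8_t     hist_id[N_OBJECTS][N_CONTEXTS][N_OPS];
    /* Only N_HISTS full histograms instead of one per combination. */
    uint64_t    counts[N_HISTS][N_BUCKETS];
} IoHistograms;

/* Mark every combination as untracked and zero all counters. */
static void
io_hist_init(IoHistograms *h)
{
    memset(h->hist_id, HIST_NONE, sizeof(h->hist_id));
    memset(h->counts, 0, sizeof(h->counts));
}

/* Assign histogram slot "id" to one combination. */
static void
io_hist_enable(IoHistograms *h, int obj, int ctx, int op, uint8_t id)
{
    h->hist_id[obj][ctx][op] = id;
}

/* Returns 1 if the sample was recorded, 0 if the combination is untracked. */
static int
io_hist_count(IoHistograms *h, int obj, int ctx, int op, int bucket)
{
    uint8_t     id = h->hist_id[obj][ctx][op];

    if (id == HIST_NONE)
        return 0;
    h->counts[id][bucket]++;    /* the one extra indirection */
    return 1;
}
```

The cost is one extra load per recorded I/O, and the saving depends entirely on how many combinations can share or skip a histogram, which is exactly the open question in (b).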

So I'm kind of lost on how to progress this, because - as previously stated -
I do not understand this challenge with memory saving and do not know the aim
or where to stop this optimization, thus I'm mostly +1 for "c", unless somebody
enlightens me, please ;)

> > and for normal use being able to distinguish latency profiles between so
> > many different categories is not that useful.
>
> I'm not that convinced by that though. It's pretty useful to separate out the
> IO latency for something like vacuuming, COPY and normal use of a
> relation. They will often have very different latency profiles.

+1

--

Anyway, I'm attaching v6 - no serious changes, just cleaning:

1. Removed dead ifdefed code (finding most significant bits) as testing by Ants
showed that CLZ has literally zero overhead.
2. Rebased and fixed a missing include of the ports/bits header for
pg_leading_zero_bits64(), dunno why it didn't complain earlier.
3. Added Ants as reviewer.
4. Fixed one comment referring to the wrong function (near enum hist_io_stat_col).
5. Added one typedef to src/tools/pgindent/typedefs.list.

-J.

Attachment Content-Type Size
v6-0001-Add-pg_stat_io_histogram-view-to-provide-more-det.patch text/x-patch 34.8 KB
