Re: Track IO times in pg_stat_io

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, smilingsamay(at)gmail(dot)com, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Track IO times in pg_stat_io
Date: 2023-03-21 02:34:51
Message-ID: 20230321023451.7rzy4kjj2iktrg2r@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-03-16 17:19:16 -0400, Melanie Plageman wrote:
> > > > I wonder if we should get rid of pgStatBlockReadTime, pgStatBlockWriteTime,
> > >
> > > And then have pg_stat_reset_shared('io') reset pg_stat_database IO
> > > stats?
> >
> > Yes.
>
> I think this makes sense but I am hesitant to do it in this patchset,
> because it feels a bit hidden...maybe?

I'd not do it in the same commit, but I don't see a problem with doing it in
the same patchset.

Now that I think about it again, this wouldn't make pg_stat_reset_shared('io')
affect pg_stat_database - I was thinking we should use pgstat_io.c stats to
provide the information for pgstat_database.c, using its own pending counter.

> > No, I don't think I am suggesting that. What I am trying to suggest is that
> > PendingIOStats should contain instr_time, but that PgStat_IO should contain
> > PgStat_Counter as microseconds, as before.
>
> So, I've modified the code to make a union of instr_time and
> PgStat_Counter in PgStat_BktypeIO. I am not quite sure if this is okay.
> I store in microsec and then in pg_stat_io, I multiply to get
> milliseconds for display.

Not a fan - what do we gain by having this union? It seems considerably
cleaner to have a struct local to pgstat_io.c that uses instr_time and have a
clean type in PgStat_BktypeIO. In fact, the code worked after just changing
that.

I don't think it makes sense to have pgstat_io_start()/end() as well as
pgstat_count_io*. For one, the name seems in a "too general namespace" - why
not a pgstat_count*?

> I considered refactoring pgstat_io_end() to use INSTR_TIME_ACCUM_DIFF()
> like [1], but, in the end I actually think I would end up with more
> operations because of the various different counters needing to be
> updated. As it is now, I do a single subtract and a few adds (one for
> each of the different statistics objects tracking IO times
> (pgBufferUsage, pgStatBlockWrite/ReadTime). Whereas, I would need to do
> an accum diff for every one of those.

Right - that only INSTR_TIME_ACCUM_DIFF() only makes sense if there's just a
single counter to update.

WRT:
/* TODO: AFAICT, pgstat_count_buffer_write_time is only called */
/* for shared buffers whereas pgstat_count_buffer_read_time is */
/* called for temp relations and shared buffers. */
/*
* is this intentional and should I match current behavior or
* not?
*/

It's hard to see how that behaviour could be intentional. Probably worth
fixing in a separate patch. I don't think we're going to backpatch, but it
would make this clearer nonetheless.

Incremental patch with some of the above changed attached.

Btw, it's quite nice how one now can attribute time more easily:

20 connections copying an 8MB file 50 times each:
SELECT reuses, evictions, writes, write_time, extends, extend_time FROM pg_stat_io WHERE backend_type = 'client backend' AND io_object = 'relation' AND io_context='bulkwrite';
┌────────┬───────────┬────────┬────────────┬─────────┬─────────────┐
│ reuses │ evictions │ writes │ write_time │ extends │ extend_time │
├────────┼───────────┼────────┼────────────┼─────────┼─────────────┤
│ 36112 │ 0 │ 36112 │ 141 │ 1523176 │ 8676 │
└────────┴───────────┴────────┴────────────┴─────────┴─────────────┘

20 connections copying an 80MB file 5 times each:
┌─────────┬───────────┬─────────┬────────────┬─────────┬─────────────┐
│ reuses │ evictions │ writes │ write_time │ extends │ extend_time │
├─────────┼───────────┼─────────┼────────────┼─────────┼─────────────┤
│ 1318539 │ 0 │ 1318539 │ 5013 │ 1523339 │ 7873 │
└─────────┴───────────┴─────────┴────────────┴─────────┴─────────────┘
(1 row)

Greetings,

Andres

Attachment Content-Type Size
v6-0001-Track-IO-times-in-pg_stat_io.patch text/x-diff 26.1 KB
v6-0002-pgstat-io-increment-pgstat_io-hackery.patch text/x-diff 13.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Luzanov 2023-03-21 03:37:22 Re: psql: Add role's membership options to the \du+ command
Previous Message Thomas Munro 2023-03-21 02:25:04 Re: Assertion failure with barriers in parallel hash join