|From:||Andres Freund <andres(at)anarazel(dot)de>|
|To:||Melanie Plageman <melanieplageman(at)gmail(dot)com>|
|Cc:||"Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, smilingsamay(at)gmail(dot)com, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>|
|Subject:||Re: Track IO times in pg_stat_io|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On 2023-04-07 12:17:38 -0400, Melanie Plageman wrote:
> Attached v9 addresses review feedback as well as resolving merge
> conflicts with recent relation extension patchset.
I've edited it a bit more:
- removed pgstat_tracks_io_time() and replaced it by returning the new
IO_COL_INVALID = -1 from pgstat_get_io_time_index() when there's no time
- moved PgStat_Counter count, time into the respective branches. It feels
somewhat wrong to access the time when we then decide there is no time.
- s/io_object/io_obj/ in pgstat_count_io_op_time(), combined with added
linebreaks, got the code to under 80 chars
- renamed pg_stat_microseconds_to_milliseconds to pg_stat_us_to_ms
- removed a spurious newline
- the times reported by pg_stat_io had their fractional part removed, due to
pg_stat_us_to_ms returning an integer
Verifying this, I saw that the write time visible in pg_stat_io didn't quite
match what I saw in log_checkpoints. But not always. Eventually I figured out
that that's not pg_stat_io's fault - log_checkpoint's write includes a lot of
things, including several other CheckPoint* routines, flushing WAL, asking the
kernel to flush things to disk... The biggest portion in my case were the
smgrwriteback() calls - which pg_stat_io doesn't track - oops.
Pushed up to and including 0003.
> I've changed pgstat_count_io_op_time() to take a count and call
> pgstat_count_io_op_n() so it can be used with smgrzeroextend(). I do
> wish that the parameter to pgstat_count_io_op_n() was called "count" and
> not "cnt"...
> I've also reordered the call site of pgstat_count_io_op_time() in a few
> locations, but I have some questions about this.
> Before, I didn't think it mattered much that we didn't finish counting
> IO time until after setting BM_VALID or BM_DIRTY and unsetting
> BM_IO_IN_PROGRESS. With the relation extension code doing this for many
> buffers at once, though, I wondered if this will make the IO timing too
> As such, I've moved pgstat_count_io_op_time() to before we set those
> flags in all locations. I did wonder if it is bad to prolong having the
> buffer pinned and not having those flags set, though.
I went back and forth about this before. I think it's ok the way you did it.
I think 0004 needs a bit more work. At the very least we would have to swap
the order of pgstat_flush_pending_entries() and pgstat_flush_io() - entirely
doable. Unlike 0003, this doesn't make pg_stat_io more complete, or such, so
I'm inclined to leave it for 17. I think there might be some more
opportunities for having counts "flow down", like the patch does.
|Next Message||Tom Lane||2023-04-08 00:14:33||Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?|
|Previous Message||Michael Paquier||2023-04-08 00:05:10||Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?|