Re: faulty error handling around pgstat_count_io_op_time()

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: faulty error handling around pgstat_count_io_op_time()
Date: 2026-06-17 00:26:22
Message-ID: ajHprjZERUqMP2Kj@paquier.xyz
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 15, 2026 at 06:26:00AM +0000, Bertrand Drouvot wrote:
> On Fri, Jun 12, 2026 at 06:01:45PM +0200, Peter Eisentraut wrote:
>> There are several places where the return value of pg_pread() or pg_pwrite()
>> is passed directly as the byte count to pgstat_count_io_op_time(). The
>> bytes argument of pgstat_count_io_op_time() is of type uint64, and so error
>> returns of -1 are going to passed as UINT64_MAX and added as such to the
>> internal statistics.
>
> Nice catch!

This thread has slipped through, and it looks like I'm involved as of
a051e71e28a1. (Please feel free to add me in CC in such cases.)

>> I think the correction here would be to move the pgstat_count_io_op_time()
>> calls to after the error returns are handled. This is effectively how most
>> other code already behaves. For example, most smgr calls don't return on
>> error, so you don't get a chance to make any pgstat calls afterwards. It's
>> only the open-coded places where we can even do that.
>
> Sounds reasonable to me and done that way in the attached.

In the xlogrecovery.c case, we should care about the short read case.
What you are doing here looks OK for this path.

In XLogFileInitInternal(), the first pgstat_count_io_op_time() is not
completely right, no? pg_pwrite_zeros() or pg_pwrite() could fail,
and it does not make sense to me to count data if we have a
save_errno, and the files are unlinked in the error path. I'd propose
to delay the count() call to happen after the error check is done.

This leads me to the v2 attached. This is your v1 plus the extra
change for XLogFileInitInternal() when the segments are initialized.
--
Michael

Attachment Content-Type Size
v2-0001-Fix-pgstat_count_io_op_time-calls-passing-error-r.patch text/plain 5.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2026-06-17 00:34:22 Re: IGNORE/RESPECT NULLS can be specified for (prokind == 'f').
Previous Message Fujii Masao 2026-06-17 00:12:33 Re: Fix ALTER DOMAIN VALIDATE CONSTRAINT locking