Re: Rework SLRU I/O errors handle

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Álvaro Herrera <alvherre(at)kurilemu(dot)de>
Cc: Maxim Orlov <orlovmg(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Rework SLRU I/O errors handle
Date: 2026-03-06 16:38:26
Message-ID: aa7908db-b88e-4bb6-a2bc-5b3cee9a28cb@iki.fi
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/03/2026 18:22, Heikki Linnakangas wrote:
> On 06/03/2026 17:33, Álvaro Herrera wrote:
>> I'm not a fan of the split.  I think it all these patches should be
>> pushed as a single commit, and avoid introducing xact_errmsg_for_io_error
>> as an exposed function.  I think that doesn't make a lot of sense.  Each
>> SLRU should have a correct and appropriate error reporting callback.
>
> Agreed.
>
>> The comment added in 0005 is bogus too.  It mentions
>> InvalidTransactionId,
>> but the problem is not that the value is 0 but rather that we get no
>> pointer.  Also, in all other callbacks the pointer is asserted to not be
>> NULL, so why don't we do the same here, and avoid an error message
>> that's not going to help anyone?  I see however that in the patch we're
>> passing a NULL to SlruReportIOError(), which means if you get an IO
>> error with any SLRU other than xact, you're going to get either a crash
>> on the assertion, or (on non-debug builds) a crash on dereferencing the
>> NULL pointer.
>
> Hmm, I thought we could just never pass a NULL pointer, but if a Write
> fails, slru.c has no context where to pull that opaque_data. Perhaps we
> should just not call the callback in that case.
>
> I'm starting to feel that what SlruReportIOError() currently uses as the
> errdetail, could well be the main error message, and the callback could
> provide the errdetail. I.e. swap the errmsg and errdetail we have now.
> That way, we can just leave out the errdetail for Write failures. The
> current errmsg we have for Write failures is pretty bad: "ERROR: could
> not access status of transaction 0". What's currently the errdetail,
> e.g. "Could not read from file \"%s\" at offset %d: %m", is a lot more
> informative.

Attached version that does that.

As another data point, we have a lot of error messages like "could not
open file \"%s\": %m" in the source tree. slru.c was the only place
where that was in the errdetail() part.

- Heikki

Attachment Content-Type Size
v2-0001-Add-a-callback-for-I-O-error-messages-in-SLRUs.patch text/x-patch 32.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2026-03-06 16:41:03 Re: Allow specifying NULL default in pg_proc.dat for "any" arguments
Previous Message Heikki Linnakangas 2026-03-06 16:22:42 Re: Rework SLRU I/O errors handle