Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt
Date: 2026-03-31 09:15:59
Message-ID: 57a8ea70-32a8-4596-bb68-5d2990b83380@postgrespro.ru
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

27.03.2026 21:30, Yura Sokolov пишет:
> Good day,
>
> 27.03.2026 12:04, Yura Sokolov wrote:
>> 26.03.2026 23:29, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
>>>> UnlockBufHdrExt does:
>>>>
>>>> buf_state |= set_bits;
>>>> buf_state &= ~unset_bits;
>>>> buf_state &= ~BM_LOCKED;
>>>>
>>>> TerminateBufferIO unconditionally does:
>>>>
>>>> unset_flag_bits |= BM_IO_ERROR;
>>>>
>>>> Due to this, AbortBufferIO and buffer_readv_complete_one are failed
>>>> to set BM_IO_ERROR with call to TerminateBufferIO.
>>>>
>>>> It was found with proprietary code that was triggered on
>>>> PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
>>>
>>> That's clearly not right. Care to write a patch? I think we should add a
>>> test for this in src/test/modules/test_aio too. As we don't rely on things
>>> like BM_IO_ERROR this is quite easy to not notice.
>> I thought, fix is too small to go the way of patches.
>> I don't mind if you just push it.
>>
>> But if I can save your time on writing test, I'll try.
>>
>> ...
>>
>> I believe, proper way is to add assertion in UnlockBufHdrExt:
>>
>> Assert(!(set_bits & unset_bits));
>>
>> And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
>> set_flag_bits.
>>
>> Is it ok?
>
> Here patchset is.
>
> I've tried to modify 001_aio.pl to test presence BM_IO_ERROR after read
> failure.
>
> I've also added FlushBuffer failure test in second patch (v1-002) using
> injection point. I don't know if 001-aio.pl is a good place for since write
> is not async yet.
>
> v1-003 just mades DebugPrintBufferRefcount prettier.

rebased.

--
regards
Yura Sokolov aka funny-falcon

Attachment Content-Type Size
v2-0001-bufmgr-Fix-possibility-to-set-BM_IO_ERROR.patch text/x-patch 5.7 KB
v2-0002-bufmgr-Add-test-for-BM_IO_ERROR-presence-after-wr.patch text/x-patch 7.7 KB
v2-0003-bufmgr-print-flag-names-in-DebugPrintBufferRefcou.patch text/x-patch 4.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2026-03-31 09:17:16 Re: Eliminating SPI / SQL from some RI triggers - take 3
Previous Message Jelte Fennema-Nio 2026-03-31 09:09:59 Re: Make copyObject work in C++