| From: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt |
| Date: | 2026-03-31 09:15:59 |
| Message-ID: | 57a8ea70-32a8-4596-bb68-5d2990b83380@postgrespro.ru |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
27.03.2026 21:30, Yura Sokolov пишет:
> Good day,
>
> 27.03.2026 12:04, Yura Sokolov wrote:
>> 26.03.2026 23:29, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
>>>> UnlockBufHdrExt does:
>>>>
>>>> buf_state |= set_bits;
>>>> buf_state &= ~unset_bits;
>>>> buf_state &= ~BM_LOCKED;
>>>>
>>>> TerminateBufferIO unconditionally does:
>>>>
>>>> unset_flag_bits |= BM_IO_ERROR;
>>>>
>>>> Due to this, AbortBufferIO and buffer_readv_complete_one are failed
>>>> to set BM_IO_ERROR with call to TerminateBufferIO.
>>>>
>>>> It was found with proprietary code that was triggered on
>>>> PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
>>>
>>> That's clearly not right. Care to write a patch? I think we should add a
>>> test for this in src/test/modules/test_aio too. As we don't rely on things
>>> like BM_IO_ERROR this is quite easy to not notice.
>> I thought, fix is too small to go the way of patches.
>> I don't mind if you just push it.
>>
>> But if I can save your time on writing test, I'll try.
>>
>> ...
>>
>> I believe, proper way is to add assertion in UnlockBufHdrExt:
>>
>> Assert(!(set_bits & unset_bits));
>>
>> And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
>> set_flag_bits.
>>
>> Is it ok?
>
> Here patchset is.
>
> I've tried to modify 001_aio.pl to test presence BM_IO_ERROR after read
> failure.
>
> I've also added FlushBuffer failure test in second patch (v1-002) using
> injection point. I don't know if 001-aio.pl is a good place for since write
> is not async yet.
>
> v1-003 just mades DebugPrintBufferRefcount prettier.
rebased.
--
regards
Yura Sokolov aka funny-falcon
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-bufmgr-Fix-possibility-to-set-BM_IO_ERROR.patch | text/x-patch | 5.7 KB |
| v2-0002-bufmgr-Add-test-for-BM_IO_ERROR-presence-after-wr.patch | text/x-patch | 7.7 KB |
| v2-0003-bufmgr-print-flag-names-in-DebugPrintBufferRefcou.patch | text/x-patch | 4.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Langote | 2026-03-31 09:17:16 | Re: Eliminating SPI / SQL from some RI triggers - take 3 |
| Previous Message | Jelte Fennema-Nio | 2026-03-31 09:09:59 | Re: Make copyObject work in C++ |