Re: IPC/MultixactCreation on the Standby server

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Dmitry <dsy(dot)075(at)yandex(dot)ru>
Cc: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: IPC/MultixactCreation on the Standby server
Date: 2025-07-29 18:15:28
Message-ID: 5E591CC6-1618-462F-9645-1E6C92934DC4@yandex-team.ru
Lists: pgsql-hackers

> On 29 Jul 2025, at 12:17, Dmitry <dsy(dot)075(at)yandex(dot)ru> wrote:
>
> But on the master, some of the requests then fail with an error, apparently invalid multixact's remain in the pages.

Thanks!

That's a bug in my patch. I don't understand it yet, but I've reproduced it with your original workload.
Most of the errors I see are shallow (offset == 0 or nextOffset == 0), but this one is interesting:

TRAP: failed Assert("shared->page_number[slotno] == pageno && shared->page_status[slotno] == SLRU_PAGE_WRITE_IN_PROGRESS"), File: "slru.c", Line: 729, PID: 91085
0 postgres 0x00000001032ea5ac ExceptionalCondition + 216
1 postgres 0x0000000102af2784 SlruInternalWritePage + 700
2 postgres 0x0000000102af14dc SimpleLruWritePage + 96
3 postgres 0x0000000102ae89d4 RecordNewMultiXact + 576

This makes me think it's some kind of I/O concurrency issue.
As expected, the error persists only if the "extend SLRU" branch is active in RecordNewMultiXact().
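
To make the failed condition concrete, here is a toy C model of what the assertion checks. It is only a sketch: the Toy* types and toy_* functions are made up for illustration and do not reflect the real slru.c structures or locking. The slot reuse shown is just one hypothetical way the condition could become false, not a diagnosis of this bug.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: field names mirror the assertion text, not the real slru.c layout. */
#define TOY_NUM_SLOTS 4

typedef enum {
    TOY_PAGE_EMPTY,
    TOY_PAGE_VALID,
    TOY_PAGE_WRITE_IN_PROGRESS
} ToyPageStatus;

typedef struct {
    long          page_number[TOY_NUM_SLOTS];
    ToyPageStatus page_status[TOY_NUM_SLOTS];
} ToySlruShared;

/* Mark a slot as holding pageno with a write in flight. */
void toy_begin_write(ToySlruShared *shared, int slotno, long pageno)
{
    shared->page_number[slotno] = pageno;
    shared->page_status[slotno] = TOY_PAGE_WRITE_IN_PROGRESS;
}

/* Hypothetical interference: the slot is reused for another page mid-write. */
void toy_reuse_slot(ToySlruShared *shared, int slotno, long new_pageno)
{
    shared->page_number[slotno] = new_pageno;
    shared->page_status[slotno] = TOY_PAGE_VALID;
}

/* The condition from the failed Assert, expressed as a predicate. */
bool toy_write_invariant_holds(const ToySlruShared *shared, int slotno, long pageno)
{
    return shared->page_number[slotno] == pageno &&
           shared->page_status[slotno] == TOY_PAGE_WRITE_IN_PROGRESS;
}
```

In this model, the predicate holds right after toy_begin_write(&s, 0, 5) and stops holding once toy_reuse_slot(&s, 0, 6) touches the same slot before the write completes.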

Thanks for testing!

Best regards, Andrey Borodin.
