Re: IPC/MultixactCreation on the Standby server

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Dmitry Yurichev <dsy(dot)075(at)yandex(dot)ru>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ivan Bykov <i(dot)bykov(at)modernsys(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>
Subject: Re: IPC/MultixactCreation on the Standby server
Date: 2025-12-04 14:17:53
Message-ID: d9996478-389a-4340-8735-bfad456b313c@iki.fi
Lists: pgsql-hackers

On 03/12/2025 19:19, Heikki Linnakangas wrote:
> On 03/12/2025 16:19, Dmitry Yurichev wrote:
>> On 12/2/25 16:48, Heikki Linnakangas wrote:
>>> Thanks! Agreed, v14-16 were the same. v17 and v18 might be worth
>>> testing separately, to make sure I didn't e.g. screw up the locking
>>> differences.
>>
>> I tested on the REL_18_1 tag (with applying v15-pg18-0001-Set-next-
>> multixid-s-offset-when-creating-a-.patch).
>> I didn't notice any deadlocks or other errors. Thanks!
>
> Okay. I fixed a few more little things:
>
> - fixed the warnings about shadowed variables that Andrey pointed out
> - in older branches before we switched to 64-bit SLRU page numbers, use
> 'int' rather than 'int64' in page number variables
> - improve wording and fix few trivial typos in comments
>
> Committed with those last-minute changes. Thanks for the testing!

While working on the 64-bit multixid offsets patch, I noticed one more
bug with this. At offset wraparound, when we set the next multixid's
offset in RecordNewMultiXact, we incorrectly set it to 0 instead of 1.
We're supposed to skip over offset 0, because 0 is reserved to mean
invalid. We do that correctly when setting the "current" multixid's
offset, because the caller of RecordNewMultiXact has already skipped
over offset 0, but I missed it for the next offset.
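
To illustrate the rule described above, here is a minimal standalone sketch. It is not the attached patch: the typedef and the helper name next_multixact_offset are hypothetical, and it assumes 32-bit offsets that wrap around modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32-bit offsets, as before the 64-bit multixid offsets work. */
typedef uint32_t MultiXactOffset;

/*
 * Hypothetical helper (not a PostgreSQL function): compute the offset to
 * record for the next multixid, given the current multixid's starting
 * offset and its member count.
 */
static MultiXactOffset
next_multixact_offset(MultiXactOffset offset, uint32_t nmembers)
{
    MultiXactOffset next;

    /* The caller has already skipped over 0 for the current multixid. */
    assert(offset != 0);

    /* Unsigned addition wraps around modulo 2^32. */
    next = offset + nmembers;

    /* 0 means "invalid", so a sum that lands exactly on 0 becomes 1. */
    if (next == 0)
        next = 1;

    return next;
}
```

With offset 0xfffffffe and two members the sum wraps to exactly 0, so the helper returns 1. A sum that wraps past 0 is left alone, since only a starting offset of 0 is ambiguous.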

I was able to reproduce that with these steps:

pg_resetwal -O 0xffffff00 -D data
pg_ctl -D data start
pgbench -s5 -i postgres
pgbench -c3 -t100 -f a.sql postgres

a.sql:
select * from pgbench_branches FOR KEY SHARE;

You get an error like this:

pgbench: error: client 2 script 0 aborted in command 0 query 0: ERROR:
MultiXact 372013 has invalid next offset

I tried to modify the new wraparound TAP test to reproduce that, but it
turned out to be difficult because you need to have multiple backends
assigning multixids concurrently to hit that.

The fix is pretty straightforward, see attached. Barring objections,
I'll commit and backport this.

- Heikki

Attachment Content-Type Size
0001-Fix-setting-next-multixid-s-offset-at-offset-wraparo.patch text/x-patch 1.4 KB
