Re: IPC/MultixactCreation on the Standby server

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Ivan Bykov <i(dot)bykov(at)modernsys(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Álvaro Herrera <alvherre(at)kurilemu(dot)de>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>
Subject: Re: IPC/MultixactCreation on the Standby server
Date: 2025-11-25 09:57:39
Message-ID: 1b39e3d3-2e69-4762-a583-4692f1fbee9a@iki.fi
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This approach makes a lot of sense to me.

I think there's one more case that this fixes: If someone uses
"pg_resetwal -m ..." to advance nextMulti, that's yet another case where
the last multixid becomes unreadable because we haven't set the next
mxid's offset in the SLRU yet. People shouldn't be running pg_resetwal
unnecessarily, but the docs for pg_resetwal include instructions on how
to determine a safe value for nextMulti. You will in fact lose the last
multixid if you follow the instructions, so it's not as safe as it
sounds. This patch fixes that issue too.

Álvaro, are you planning to commit this? I've been looking at this code
lately for the 64-bit mxoffsets patch, so I could also pick it up if
you'd like.

> @@ -1383,7 +1417,10 @@ GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **members,
> * 1. This multixact may be the latest one created, in which case there is
> * no next one to look at. In this case the nextOffset value we just
> * saved is the correct endpoint.
> + * TODO: how does it work on Standby? MultiXactState->nextMXact does not seem to be up-to date.
> + * nextMXact and nextOffset are in sync, so nothing bad can happen, but nextMXact seems mostly random.

That's scary. AFAICS MultiXactState->nextMXact should be up-to-date in
hot standby mode. Are we missing MultiXactAdvanceNextMXact() calls
somewhere, or what's going on?

The case 1. is still valid, you can indeed just look at the saved
nextOffset. But now the next offset should also be set on the SLRU,
except right after upgrading to a version that has this patch. So I
guess we still need to have this special case to deal with upgrades.

> *
> + * THIS IS NOT POSSIBLE ANYMORE, KEEP IT FOR HISTORIC REASONS.
> * 2. The next multixact may still be in process of being filled in: that
> * is, another process may have done GetNewMultiXactId but not yet written
> * the offset entry for that ID. In that scenario, it is guaranteed that

This comment needs to be cleaned up to explain how all this works now...

> @@ -1138,8 +1168,13 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
> result = FirstMultiXactId;
> }
>
> - /* Make sure there is room for the MXID in the file. */
> - ExtendMultiXactOffset(result);
> + /*
> + * Make sure there is room for the MXID and next offset in the file.
> + * We might overflow to the next segment, but we don't need to handle
> + * FirstMultiXactId specifically, because ExtendMultiXactOffset handles
> + * both cases well: 0 offset and FirstMultiXactId would create segment.
> + */
> + ExtendMultiXactOffset(MultiXactState->nextMXact + 1);
>
> /*
> * Reserve the members space, similarly to above. Also, be careful not to

Does this create the file correctly, if you upgrade the binary to a new
version that contains this patch, and nextMXact was at a segment
boundary before the upgrade?

> @@ -2487,10 +2509,11 @@ ExtendMultiXactOffset(MultiXactId multi)
>
> /*
> * No work except at first MultiXactId of a page. But beware: just after
> - * wraparound, the first MultiXactId of page zero is FirstMultiXactId.
> + * wraparound, the first MultiXactId of page zero is FirstMultiXactId,
> + * make sure we are not in that case.
> */
> - if (MultiXactIdToOffsetEntry(multi) != 0 &&
> - multi != FirstMultiXactId)
> + Assert(multi != FirstMultiXactId);
> + if (MultiXactIdToOffsetEntry(multi) != 0)
> return;
>
> pageno = MultiXactIdToOffsetPage(multi);

I don't quite understand this change. I guess the point is that the
caller now never calls this with FirstMultiXactId, but it feels a bit
weird to assert and rely on that here.

I didn't look at the test changes yet.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2025-11-25 10:04:36 Re: ON CONFLICT DO SELECT (take 3)
Previous Message David Geier 2025-11-25 09:41:54 Re: Remove useless casting to the same type