Re: POC: make mxidoff 64 bits

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Maxim Orlov <orlovmg(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: make mxidoff 64 bits
Date: 2025-11-21 13:56:33
Message-ID: 6c298bc4-7029-4c1d-bf16-3e094842ce32@iki.fi
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 21/11/2025 14:15, Ashutosh Bapat wrote:
> On Mon, Nov 17, 2025 at 10:05 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>
>> Ashutosh, you were interested in reviewing this earlier. Would you have
>> a chance to review this now, before I commit it?
>
> 0002 seems to be fine. It's just moving code from one file to another.
> However, the name multixact_internals.h seems to be misusing term
> "internal". I would expect an "internal.h" to expose things for use
> within the module and not "externally" in other modules. But this
> isn't the first time, buf_internal.h, toast_internal.h
> bgworker_internal.h and so on have their own meaning of what
> "internal" means to them. But it might be better to use something more
> meaningful than "internal" in this case. The file seems to contain
> declarations related to how pg_multixact SLRU is structured. To that
> effect multixact_slru.h or multixact_layout.h may be more appropriate.

Yeah, I went with multixact_internal.h because of the precedence. It's
not great, but IMHO it's better to be consistent than invent a new
naming scheme.

> There are two offsets that that file deals with offset within
> pg_multixact/offset, MultiXactOffset and member offset (flag offset
> and xid offset) within pg_multixact/members. It's quite easy to get
> confused between those when reading that code.

Agreed, those are confusing. I'll think about that a little more.

> The reason why this eliminates the need for wraparound is mentioned
> somewhere in GetNewMultiXactId(), but probably it should be mentioned
> at a more prominent place and also in the commit message. I expected
> it to be in the prologue of GetNewMultiXactId(), and then a reference
> to prologue from where the comment is right now.
>
> ereport(ERROR,
> (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> errmsg("MultiXact members would wrap around")));
>
> If a server ever reaches this point, there is no way but to create
> another cluster, if the applications requires multi-xact ids?

Pretty much. You can also vacuum freeze everything, so that no multixids
are in use, and then use pg_resetwal to reset nextOffset to a lower value.

That obviously sounds bad, but this is a "can't happen" situation. For
comparison, we don't provide any better way to recover from running out
of LSNs either.

> We should also provide this as an errhint().
I think no. You cannot run into this "organically" by just running the
system. If you do manage to hit it, it's a sign of some other trouble,
and I don't want to guess what that might be, or what might be the
appropriate way to recover.

> In ExtendMultiXactMember() the last comment mentions a wraparound
> /*
> * Advance to next page, taking care to properly handle the wraparound
> * case. OK if nmembers goes negative.
> */
> I thought this wraparound is about offset wraparound, which can not
> happen now. Since you have left the comment intact, it's something
> else. Is the wraparound of offset within the page? Maybe requires a
> bit more clarification?

It was indeed about offset wraparound. I'll remove it.

> PerformMembersTruncation(MultiXactOffset oldestOffset, MultiXactOffset
> newOldestOffset)
> {
> ... snip ...
> - segment += 1;
> - }
> + SimpleLruTruncate(MultiXactMemberCtl,
> + MXOffsetToMemberPage(newOldestOffset));
> }
>
> Most of the callers of SimpleLruTruncate() call it directly. Why do we
> want to keep this static wrapper? PerformOffsetsTruncation() has a
> comment which seems to need the wrapper. But
> PerformMembersTruncation() doesn't even have that.

Hmm, yeah those wrappers are a bit vestigial now. I'm inclined to keep
them, because as you said, PerformOffsetsTruncation() provides a place
for the comment. And given that, it seems best to keep
PerformMembersTruncation(), for symmetry.

> MultiXactState->oldestMultiXactId = newOldestMulti;
> MultiXactState->oldestMultiXactDB = newOldestMultiDB;
> + MultiXactState->oldestOffset = newOldestOffset;
> LWLockRelease(MultiXactGenLock);
>
> Is this something we are missing in the current code? I can not
> understand the connection between this change and the fact that offset
> wraparound is not possible with wider multi xact offsets. Maybe I
> missed some previous discussion.

Good question. At first I intended to extract that to a separate commit,
before the main patch, because it seems like a nice improvement: We have
just calculated 'oldestOffset', so why not update the value in shared
memory while we have it? But looking closer, I'm not sure if it'd be
sane without the 64-bit offsets. Currently, oldestOffset is only updated
by SetOffsetVacuumLimit(), which also updates offsetStopLimit. We could
get into a state where oldestOffset is set, but offsetStopLimit is not.
With 64-bit offsets that's no longer a concern because it removes
offsetStopLimit altogether.

> I have reviewed patch 0002 and multxact.c changes in 0003. So far I
> have only these comments. I will review the pg_upgrade.c changes next.

Thanks for the review so far!

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2025-11-21 14:03:15 Re: more C99 cleanup
Previous Message Peter Eisentraut 2025-11-21 13:50:03 more C99 cleanup