Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Xuneng Zhou <xunengzhou(at)gmail(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Naga Appani <nagnrik(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
Date: 2025-10-18 11:17:57
Message-ID: 95850ce1-2d5e-4271-92ea-c2a02e36b303@vondra.me
Lists: pgsql-hackers

Thanks for working on this. I'm wondering whether this is expected to
(or could) help with monitoring for "space exhaustion" issues, which we
currently can't do easily, as the relevant state isn't exposed anywhere.

This is in multixact.c at line ~1177, where we do this:

    if (MultiXactState->oldestOffsetKnown &&
        MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
                                 nextOffset, nmembers))
    {
        ereport(ERROR, ...
    }

But I'm not sure the current patch exposes enough information to
calculate how much space remains - calculating that requires
offsetStopLimit and nextOffset.

The stopLimit could be calculated from oldest_offset, which the patch
returns. It's not quite trivial. It depends on BLCKSZ through
MULTIXACT_MEMBERS_PER_PAGE, and various other internal constants. It's
tempting to hardcode those into monitoring scripts, which then gets
broken in subtle ways with custom builds or if we change something
(which for multixacts we can).
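
To illustrate the BLCKSZ dependency, here is a rough sketch of how the
members-per-page value falls out of the block size. The group layout (4
members per group, 4 flag bytes per group, 4-byte TransactionId) is how I
recall the macros in multixact.c and should be checked against the actual
source tree. The function name is made up for this example.

```c
#include <stdint.h>

/*
 * Assumed layout, recalled from multixact.c (verify against your tree):
 * members are stored in groups of 4 TransactionIds preceded by 4 flag bytes.
 */
#define SKETCH_MEMBERS_PER_GROUP   4
#define SKETCH_FLAGBYTES_PER_GROUP 4
#define SKETCH_XID_SIZE            4   /* sizeof(TransactionId), assumed */
#define SKETCH_GROUP_SIZE \
    (SKETCH_XID_SIZE * SKETCH_MEMBERS_PER_GROUP + SKETCH_FLAGBYTES_PER_GROUP)

/* Hypothetical helper: number of multixact members that fit on one page. */
static uint32_t
sketch_members_per_page(uint32_t blcksz)
{
    /* Integer division: a partial group at the end of the page is unused. */
    uint32_t groups_per_page = blcksz / SKETCH_GROUP_SIZE;

    return groups_per_page * SKETCH_MEMBERS_PER_GROUP;
}
```

Under those assumptions the result differs between builds with different
block sizes, which is exactly the kind of value a monitoring script should
not hardcode.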

And I don't think the patch exposes nextOffset, right? So AFAICS we
can't actually calculate the remaining space.

Could it either return nextOffset, or maybe actually calculate and
return the remaining space? And perhaps the "total" space, so that it's
possible to calculate what fraction of the space we already consumed.
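
Roughly what I have in mind, as a sketch only: assuming member offsets
are 32-bit counters that wrap around, and that oldest_offset, nextOffset
and offsetStopLimit are all available, the remaining and consumed space
would be simple modular differences. The function names are hypothetical,
not something the patch provides.

```c
#include <stdint.h>

/* Assumption: member offsets are 32-bit counters that wrap around. */
typedef uint32_t MultiXactOffset;

/*
 * Hypothetical helper: members that can still be allocated before
 * next_offset reaches stop_limit.
 */
static uint32_t
sketch_members_remaining(MultiXactOffset next_offset,
                         MultiXactOffset stop_limit)
{
    /*
     * Unsigned subtraction is modulo 2^32, so this is the forward
     * distance even if the counter wrapped between the two values.
     */
    return stop_limit - next_offset;
}

/*
 * Hypothetical helper: fraction of the window between oldest_offset and
 * stop_limit that has already been consumed.
 */
static double
sketch_members_used_fraction(MultiXactOffset oldest_offset,
                             MultiXactOffset next_offset,
                             MultiXactOffset stop_limit)
{
    uint32_t total = stop_limit - oldest_offset;
    uint32_t used = next_offset - oldest_offset;

    /* Degenerate window: report 0 rather than divide by zero. */
    if (total == 0)
        return 0.0;

    return (double) used / (double) total;
}
```

With both values returned by the function, a monitoring query could alert
on the fraction without knowing anything about the on-disk layout.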

I'm actually not entirely convinced we should be exposing the raw
internal information this patch aims to expose, because a lot of it
feels like an internal implementation detail, and it's going to be hard
to interpret.

Knowing num_mxids / num_members or members_size is nice, but how would
I judge how far the system is from hitting some threshold or hard limit?
Is there some maximum number of mxids/members that we could return? Or
something like that?

Similarly for oldest_multixact / oldest_offset. How useful is that
without knowing the "next" value for each of those?

Or am I missing something obvious?

regards

--
Tomas Vondra
