| From: | Naga Appani <nagnrik(at)gmail(dot)com> |
|---|---|
| To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
| Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Xuneng Zhou <xunengzhou(at)gmail(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring |
| Date: | 2025-12-06 17:52:57 |
| Message-ID: | CA+QeY+CF2q2M51k5t4rZZ2SNeEq8ORf3Dr8T1+jm72VsHRJfjw@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi Ashutosh,
Thanks for the review!
I agree - comparing the exposed members_size against the documented
thresholds is sufficient for monitoring purposes.
This aligns with the approach taken in v11: exposing the current usage in
a way consistent with other PostgreSQL counters (e.g., XIDs, OIDs), without
introducing user-visible remaining-capacity calculations whose behavior is
inconsistent and difficult to interpret externally. In the same spirit, I
removed oldest_offset: as we discussed, it is internal and does not
provide an actionable signal to users.
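
To illustrate why plain arithmetic on these counters is hard to interpret
externally, here is a minimal sketch of a wrap-aware comparison. It assumes
32-bit offsets that wrap around, as in the multixact.c code quoted below; the
names offset_precedes and offset_distance are hypothetical and are not the
backend's functions, and the sketch ignores the SLRU page/segment alignment
that the backend also applies.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: offsets are 32-bit counters that wrap around.
 * Hypothetical helper: returns true if offset a logically precedes b,
 * using modulo-2^32 arithmetic instead of a plain "a < b".
 * The cast to int32_t relies on two's-complement conversion, which holds
 * on the platforms PostgreSQL supports.
 */
static bool
offset_precedes(uint32_t a, uint32_t b)
{
    int32_t diff = (int32_t) (a - b);

    return diff < 0;
}

/*
 * Hypothetical helper: forward distance from "from" to "to", modulo 2^32.
 * Unsigned subtraction wraps, so this stays correct across the wrap point.
 */
static uint32_t
offset_distance(uint32_t from, uint32_t to)
{
    return to - from;
}
```

For example, with from = 0xFFFFFFF0 and to = 0x00000010, a plain "from < to"
check is false, while offset_precedes() reports that from precedes to and
offset_distance() returns 32. An external tool that subtracts the raw counters
would have to get this, plus the alignment rules, right.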
If this addresses the concerns raised so far, I would appreciate it if v11
could be considered for commit.
On Mon, Nov 10, 2025 at 12:13 AM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Wed, Nov 5, 2025 at 6:43 AM Naga Appani <nagnrik(at)gmail(dot)com> wrote:
> >
> > Understanding
> > ============
> > Based on reading the relevant parts of multixact.c and observing the runtime
> > behavior, both approaches seem to run into limitations when trying to derive a
> > “remaining members” value outside the backend. I may be missing details, but the
> > behavior I observed suggests that a reliable computation might require
> > duplicating several internal mechanisms, including:
> > - wrap-aware offset comparison
> > - SLRU page and segment alignment rules
> > - SetOffsetVacuumLimit’s segment recalculation
> >
> > Without accounting for those, the derived numbers behaved inconsistently across
> > tests, sometimes staying at 0 until a large jump, and in other cases increasing
> > between exhaustion cycles. This seems broadly consistent with your concern that
> > simple arithmetic on these counters does not match how the backend determines
> > wraparound risk.
> >
> > To be clear, this interpretation is based only on what I could infer from the
> > code and testing, and I may not be capturing the entire picture. But from what I
> > observed, a user-visible “remaining members” metric does not seem
> > straightforward without exposing or replicating backend logic.
>
> Right now MultiXactOffsetWouldWrap() assesses if the given distance is
> higher than the permitted distance between start and boundary. I think
> we could instead change it to report the permitted distance based on
> start and boundary; use it to report remaining space (after
> multiplying it with bytes per member) and also use it to assess
> whether the required distance is within that boundary or whether we
> need a warning. But ...
>
> On Sat, Oct 18, 2025 at 4:48 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> >
> > Thanks for working on this. I'm wondering if this is expected / could
> > help with monitoring for "space exhaustion" issues, which we currently
> > can't do easily, as it's not exposed anywhere.
> >
> > This is in multixact.c at line ~1177, where we do this:
> >
> >     if (MultiXactState->oldestOffsetKnown &&
> >         MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
> >                                  nextOffset, nmembers))
> >     {
> >         ereport(ERROR, ...
> >     }
> >
> > But I'm not sure the current patch exposes enough information to
> > calculate how much space remains - calculating that requires
> > offsetStopLimit and nextOffset.
>
> The function exposes the number of existing members and the amount of
> space they consume (members_size). The documentation mentions
> space-related thresholds of 10GB and 20GB. Isn't comparing members_size to
> these thresholds enough to take appropriate action? If so, we could
> report the difference between these respective thresholds and
> members_size as a metric of space remaining before a given threshold
> is triggered.
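
For reference, the comparison described above can be done entirely on the
monitoring side. A minimal sketch, assuming the 10GB and 20GB figures mentioned
in this thread and treating 1GB as 2^30 bytes (both are assumptions of the
sketch, as are the macro and function names, which are not part of the patch):

```c
#include <stdint.h>

/* Assumption: 1GB is taken as 2^30 bytes. */
#define GIB                     ((uint64_t) 1 << 30)

/* Assumption: thresholds as quoted in this thread (10GB and 20GB). */
#define MEMBERS_SOFT_THRESHOLD  (10 * GIB)
#define MEMBERS_HARD_THRESHOLD  (20 * GIB)

/*
 * Hypothetical helper: bytes left before members_size reaches the given
 * threshold, clamped to 0 once the threshold has been reached or passed.
 */
static uint64_t
members_headroom(uint64_t members_size, uint64_t threshold)
{
    if (members_size >= threshold)
        return 0;

    return threshold - members_size;
}
```

A monitoring job would feed it the members_size value returned by the function
in v11, e.g. members_headroom(members_size, MEMBERS_SOFT_THRESHOLD), and alert
when the result approaches 0.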
Best regards,
Naga