Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Naga Appani <nagnrik(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
Date: 2025-09-05 11:27:34
Message-ID: CAExHW5tDxjXKcP0XuUZXb_UbpAs_oQ29HyOtvL0xf7dCkk5ypw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 4, 2025 at 2:41 AM Naga Appani <nagnrik(at)gmail(dot)com> wrote:
>
> On Fri, Aug 22, 2025 at 6:45 AM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Fri, Aug 22, 2025 at 7:37 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> > >
> Updated docs to include both counts and approximate storage.
>

This one is remaining.
+ up to approximately 2^32 entries before reaching wraparound.

... 2^32 entries (occupying roughly 20GB in the
<literal>pg_multixact/members</literal> directory) before reaching
wraparound. ...
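
As a sanity check on the 20GB figure, here is a minimal standalone sketch of the arithmetic. It is not code from the patch. It assumes the members SLRU stores entries in groups of four, each group holding 4 flag bytes plus four 4-byte transaction IDs (20 bytes per group, about 5 bytes per member), and it ignores per-page padding. The names approx_members_bytes, MEMBERS_PER_GROUP and BYTES_PER_GROUP are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: members are stored in groups of four. */
#define MEMBERS_PER_GROUP 4
/* Assumed group size: 4 flag bytes + four 4-byte transaction IDs. */
#define BYTES_PER_GROUP (4 + MEMBERS_PER_GROUP * 4)

/* Approximate bytes needed for `members` entries, ignoring page padding. */
static uint64_t
approx_members_bytes(uint64_t members)
{
    /* The member offset space is 32 bits wide, so cap the input there. */
    assert(members <= (UINT64_C(1) << 32));

    /* Whole groups, plus one more group if there is a partial one. */
    return members / MEMBERS_PER_GROUP * BYTES_PER_GROUP
        + (members % MEMBERS_PER_GROUP ? BYTES_PER_GROUP : 0);
}
```

Under these assumptions, 2^32 entries come to 21,474,836,480 bytes, which is 20 GiB.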

+ See <xref linkend="vacuum-for-multixact-wraparound"/> for further
details on multixact wraparound.

I don't think we need this reference here. Reference back from that
section is enough.

+ * Returns NULL if the oldest referenced offset is unknown, which happens during
+ * system startup or when no MultiXact references exist in any relation.

If no MultiXact references exist and GetMultiXactInfo() returns
false, MultiXactMemberFreezeThreshold() will assume the worst, which I
take to mean that it will trigger aggressive autovacuum. The absence of
MultiXact references is a common case, which shouldn't be treated as
the worst case. The comment I quoted means "the oldest value of the
offset referenced by any multixact referenced by a relation *may not
always be known*". You seem to have interpreted "may not be known" as
"does not exist". That's not right. I would write this as "Returns NULL
if the oldest referenced offset is unknown, which happens during system
startup".

Similarly I would rephrase the following docs as
+ <para>
+ The function returns <literal>NULL</literal> when multixact
statistics are unavailable.
+ For example, during startup before multixact initialization completes or when
+ the oldest member offset cannot be determined.

"The function returns <literal>NULL</literal> when the oldest
multixact offset corresponding to a multixact referenced by a relation
is not known after starting the system."

> >
> > @@ -0,0 +1,127 @@
> > +# High-signal invariants for pg_get_multixact_stats()

What does "High-signal" mean here? Is that term defined somewhere?
Using terms that most of the contributors are familiar with improves
readability. If a new term is required, it needs to be defined first.
But I doubt something here requires defining a new term.

> > What's a driver transaction?
> A driver transaction is simply the controlling session that stays open
> while snapshots are taken.

I still don't understand the purpose of this transaction.
pg_get_multixact_stats() isn't transactional, so the driver transaction
isn't holding any "snapshot" of the stats. It also doesn't create any
multixact and hence does not contribute to testing the output of
pg_get_multixact_stats(). Whatever this session is doing can be done
outside a transaction too. Which step in this session requires an
outer transaction?

Some more comments
+ Returns statistics about current multixact usage:
+ <literal>num_mxids</literal> is the number of multixact IDs assigned,

Is this the number of multixact IDs assigned so far (since some
starting point), or the number of multixact IDs currently in the system?

+ <literal>num_members</literal> is the number of multixact member
entries created,

Similarly this.

+ multixact allocation and usage patterns in real time. For example:

suggestion: ... real time, for example: ... Otherwise the sentence
started by "For example" is not a complete sentence.

+
+ values[0] = Int32GetDatum(multixacts);

This should be UInt32GetDatum(), since multixacts is uint32.

+ values[1] = Int64GetDatum(members);

Similarly here, since MultiXactOffset is uint32.

+ values[4] = Int64GetDatum(oldestOffset);

Similarly here, since MultiXactOffset is uint32.
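
To illustrate the signedness concern behind these three comments, here is a minimal standalone sketch in plain C. It deliberately does not use PostgreSQL's Datum macros, and the function names fits_in_int4 and widen_to_int8 are hypothetical. It only shows that a uint32 counter above 2^31 - 1 cannot be represented in a signed 32-bit (int4) output column, while a signed 64-bit (int8) value can hold any uint32.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a uint32 counter can be stored in a signed 32-bit (int4) column. */
static bool
fits_in_int4(uint32_t value)
{
    return value <= (uint32_t) INT32_MAX;
}

/* Widening to a signed 64-bit (int8) value always preserves a uint32. */
static int64_t
widen_to_int8(uint32_t value)
{
    return (int64_t) value;
}
```

For example, fits_in_int4() is true for 2147483647 and false for 2147483648, whereas widen_to_int8() returns 4294967295 unchanged for the largest uint32.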

+# Get MultiXact state
+{
+ oid => '9001',
+ descr => 'get current multixact member and multixact ID counts and
oldest values',

suggestion: get current multixact usage statistics.

+ proname => 'pg_get_multixact_stats',
+ prorettype => 'record',
+ proargtypes => '',
+ proallargtypes => '{int4,int8,int8,xid,int8}',
+ proargmodes => '{o,o,o,o,o}',
+ proargnames =>
'{num_mxids,num_members,members_size,oldest_multixact,oldest_offset}',
+ provolatile => 'v',
+ proparallel => 's',
+ prosrc => 'pg_get_multixact_stats'
+},

I like the way you have formatted the new entry, but other entries in
this file are not formatted this way. It would be good to format it
like the other entries, but if other reviewers prefer this way, we can
go with this too.

--
Best Wishes,
Ashutosh Bapat
