Re: Add new option 'all' to pg_stat_reset_shared()

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, boekewurm+postgres(at)gmail(dot)com, Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add new option 'all' to pg_stat_reset_shared()
Date: 2023-11-06 08:30:13
Message-ID: CALj2ACVjpHfkCUWSMfpX-oyuqDCenbbduxNfQWPUBXZ0TChMrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> Thanks all for the comments!
>
> On Fri, Nov 3, 2023 at 5:17 AM Matthias van de Meent
> <boekewurm+postgres(at)gmail(dot)com> wrote:
> > Knowing that your metrics have a shared starting point can be quite
> > valuable, as it allows you to do some math that would otherwise be
> > much less accurate when working with stats over a short amount of
> > time. I've not used these stats systems much myself, but skew between
> > metrics caused by different reset points can be difficult to detect
> > and debug, so I think an atomic call to reset all these stats could be
> > worth implementing.
>
> Since each stats, except wal_prefetch was reset acquiring LWLock,
> attached PoC patch makes the call atomic by using these LWlocks.
>
> If this is the right direction, I'll try to make wal_prefetch also take
> LWLock.

+ // Acquire LWLocks
+ LWLock *locks[] = {&stats_archiver->lock, &stats_bgwriter->lock,
+ &stats_checkpointer->lock, &stats_wal->lock};
+
+ for (int i = 0; i < STATS_SHARED_NUM_LWLOCK; i++)
+ LWLockAcquire(locks[i], LW_EXCLUSIVE);
+
+ for (int i = 0; i < BACKEND_NUM_TYPES; i++)
+ {
+ LWLock *bktype_lock = &stats_io->locks[i];
+ LWLockAcquire(bktype_lock, LW_EXCLUSIVE);
+ }

Well, that's a total of ~17 LWLocks this new function takes to make
the stats reset atomic. I'm not sure if this atomicity is worth the
effort which can easily be misused - what if someone runs something
like SELECT pg_stat_reset_shared() FROM generate_series(1,
100000....n); to cause heavy lock acquisition and release cycles?

IMV, atomicity is not something that applies for the stats reset
operation because stats are approximate numbers by nature after all.
If the pg_stat_reset_shared() resets stats for only a bunch of stats
types and fails, it's the basic application programming style that
when a query fails it's the application that needs to have a retry
mechanism. FWIW, the atomicity doesn't apply today if someone wants to
reset stats in a loop for all stats types.

> On 2023-11-04 10:49, Andres Freund wrote:
>
> > Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(),
> > instead of
> > pg_stat_reset_shared('all') for this purpose.
>
> In the attached PoC patch the shared statistics are reset by calling
> pg_stat_reset_shared() not pg_stat_reset_shared('all').

Some quick comments:

1.
+/*
+pg_stat_reset_shared_all(PG_FUNCTION_ARGS)
+{
+ pgstat_reset_shared_all();
+ PG_RETURN_VOID();
+}

IMO, simpler way is to move the existing code in
pg_stat_reset_shared() to a common internal function like
pgstat_reset_shared(char *target) and the pg_stat_reset_shared_all()
can just loop over all the stats targets.

2.
+{ oid => '8000',
+ descr => 'statistics: reset collected statistics shared across the cluster',
+ proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void',
+ proargtypes => '', prosrc => 'pg_stat_reset_shared_all' },

Why a new function consuming the oid? Why can't we just do the trick
of proisstrict => 'f' and if (PG_ARGISNULL(0)) { reset all stats} else
{reset specified stats kind} like the pg_stat_reset_slru()?

3. I think the new reset all stats function must also consider
resetting all SLRU stats, no?
/* stats for fixed-numbered objects */
PGSTAT_KIND_ARCHIVER,
PGSTAT_KIND_BGWRITER,
PGSTAT_KIND_CHECKPOINTER,
PGSTAT_KIND_IO,
PGSTAT_KIND_SLRU,
PGSTAT_KIND_WAL,

4. I think the new reset all stats function must also consider
resetting recovery_prefetch.

5.
+ If no argument is specified, reset all these views at once.
+ Note current patch is WIP and pg_stat_recovery_prefetch is not reset.
</para>

How about "If the argument is NULL, all counters shown in all of these
views are reset."?

6. Add a test case to cover the code in stats.sql.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Banck 2023-11-06 08:32:27 Re: Version 14/15 documentation Section "Alter Default Privileges"
Previous Message Amit Kapila 2023-11-06 08:27:42 Re: Synchronizing slots from primary to standby