Re: shared-memory based stats collector - v70

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: andres(at)anarazel(dot)de
Cc: melanieplageman(at)gmail(dot)com, pryzby(at)telsasoft(dot)com, thomas(dot)munro(at)gmail(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: shared-memory based stats collector - v70
Date: 2022-04-08 04:44:43
Message-ID: 20220408.134443.298969491538816073.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Thu, 7 Apr 2022 20:59:21 -0700, Andres Freund <andres(at)anarazel(dot)de> wrote in
> Hi,
>
> On 2022-04-08 11:10:14 +0900, Kyotaro Horiguchi wrote:
> > I can read it. But I'm not sure that the difference is obvious for
> > average users between "starting a standby from a basebackup" and
> > "starting a standby after a normal shutdown"..
>
> Yea, that's what I was concerned about. How about:
>
> <para>
> Cumulative statistics are collected in shared memory. Every
> <productname>PostgreSQL</productname> process collects statistics locally
> then updates the shared data at appropriate intervals. When a server,
> including a physical replica, shuts down cleanly, a permanent copy of the
> statistics data is stored in the <filename>pg_stat</filename> subdirectory,
> so that statistics can be retained across server restarts. In contrast,
> when starting from an unclean shutdown (e.g., after an immediate shutdown,
> a server crash, starting from a base backup, and point-in-time recovery),
> all statistics counters are reset.
> </para>

It looks perfect overall, and especially with regard to that concern.

> I think I like my version above a bit better?

Quite a bit. It didn't address the concern.

> > > 2)
> > > The edit is not a problem, but it's hard to understand what the existing
> > > paragraph actually means?
> > >
> > > diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
> > > index 3247e056663..8bfb584b752 100644
> > > --- a/doc/src/sgml/high-availability.sgml
> > > +++ b/doc/src/sgml/high-availability.sgml
> > > @@ -2222,17 +2222,17 @@ HINT: You can then restart the server after making the necessary configuration
> > > ...
> > > <para>
> > > - The statistics collector is active during recovery. All scans, reads, blocks,
> > > + The cumulative statistics system is active during recovery. All scans, reads, blocks,
> > > index usage, etc., will be recorded normally on the standby. Replayed
> > > actions will not duplicate their effects on primary, so replaying an
> > > insert will not increment the Inserts column of pg_stat_user_tables.
> > > The stats file is deleted at the start of recovery, so stats from primary
> > > and standby will differ; this is considered a feature, not a bug.
> > > </para>
> > >
> > > <para>
> >
> > Agreed partially. It's too detailed. It might not need to mention WAL
> > replay.
>
> My concern is more that it seems halfway nonsensical. "Replayed actions will
> not duplicate their effects on primary" - I can guess what that means, but not
> more. There's no "Inserts" column of pg_stat_user_tables.
>
>
> <para>
> The cumulative statistics system is active during recovery. All scans,
> reads, blocks, index usage, etc., will be recorded normally on the
> standby. However, WAL replay will not increment relation and database
> specific counters. I.e. replay will not increment pg_stat_all_tables
> columns (like n_tup_ins), nor will reads or writes performed by the
> startup process be tracked in the pg_statio views, nor will associated
> pg_stat_database columns be incremented.
> </para>

It looks clearer, since it mentions user-facing interfaces with concrete
example columns.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
