Re: shared-memory based stats collector - v69

From: Andres Freund <andres(at)anarazel(dot)de>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector - v69
Date: 2022-04-05 21:23:38
Message-ID: 20220405212338.tvhtm5fxhmczaib2@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-04-05 13:51:12 -0700, David G. Johnston wrote:
> On Mon, Apr 4, 2022 at 8:05 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > - added an architecture overview comment to the top of pgstat.c - not sure
> > if
> > it makes sense to anybody but me (and perhaps Horiguchi-san)?
> >
> >
> I took a look at this, diff attached.

Thanks!

> Some typos and minor style stuff,
> plus trying to bring a bit more detail to the caching mechanism. I may
> have gotten it wrong in adding more detail though.
>
> + * read-only, backend-local, transaction-scoped, hashtable
> (pgStatEntryRefHash)
> + * in front of the shared hashtable, containing references
> (PgStat_EntryRef)
> + * to shared hashtable entries. The shared hashtable thus only needs to be
> + * accessed when the PgStat_HashKey is not present in the backend-local
> hashtable,
> + * or if stats_fetch_consistency = 'none'.
>
> I'm under the impression, but didn't try to confirm, that the pending
> updates don't use the caching mechanism

They do.

>, but rather add to the shared queue

Queue? Maybe you mean the hashtable?

>, and so the cache is effectively read-only. It is also transaction-scoped
>based upon the GUC and the nature of stats vis-a-vis transactions.

No, that's not right. I think you might be thinking of
pgStatLocal.snapshot.stats?

I guess I should add a paragraph about snapshots / fetch consistency.

> Even before I added the read-only and transaction-scoped I got a bit hung
> up on reading:
> "The shared hashtable only needs to be accessed when no prior reference to
> the shared hashtable exists."

> Thinking in terms of key seems to make more sense than value in this
> sentence - even if there is a one-to-one correspondence.

Maybe "prior reference to the shared hashtable exists for the key"?

> I am wondering why there are no mentions to the header files in this
> architecture, only the .c files.

Hm, I guess, but I'm not sure it'd add a lot? It's really just intended to
give a starting point (and it can't be worse than explanation of the current
system).

> diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
> index bfbfe53deb..504f952c0e 100644
> --- a/src/backend/utils/activity/pgstat.c
> +++ b/src/backend/utils/activity/pgstat.c
> @@ -4,9 +4,9 @@
> *
> *
> * PgStat_KindInfo describes the different types of statistics handled. Some
> - * kinds of statistics are collected for fixed number of objects
> - * (e.g. checkpointer statistics). Other kinds are statistics are collected
> - * for variable-numbered objects (e.g. relations).
> + * kinds of statistics are collected for a fixed number of objects
> + * (e.g., checkpointer statistics). Other kinds of statistics are collected

Was that comma after e.g. intentional?

Applied the rest.

> + * for a varying number of objects (e.g., relations).
> * Fixed-numbered stats are stored in plain (non-dynamic) shared memory.
> *
> @@ -19,19 +19,21 @@
> *
> * All variable-numbered stats are addressed by PgStat_HashKey while running.
> * It is not possible to have statistics for an object that cannot be
> - * addressed that way at runtime. A wider identifier can be used when
> + * addressed that way at runtime. A alternate identifier can be used when
> * serializing to disk (used for replication slot stats).

Not sure this improves things.

> * The names for structs stored in shared memory are prefixed with
> * PgStatShared instead of PgStat.
> @@ -53,15 +55,16 @@
> * entry in pgstat_kind_infos, see PgStat_KindInfo for details.
> *
> *
> - * To keep things manageable stats handling is split across several
> + * To keep things manageable, stats handling is split across several

Done.

> * files. Infrastructure pieces are in:
> - * - pgstat.c - this file, to tie it all together
> + * - pgstat.c - this file, which ties everything together

I liked that :)

> - * Each statistics kind is handled in a dedicated file:
> + * Each statistics kind is handled in a dedicated file, though their structs
> + * are defined here for lack of better ideas.

-0.5

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2022-04-05 21:43:49 Re: shared-memory based stats collector - v69
Previous Message Greg Stark 2022-04-05 21:14:09 Re: [PATCH] Add extra statistics to explain for Nested Loop