Re: shared-memory based stats collector - v68

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, gkokolatos(at)protonmail(dot)com, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector - v68
Date: 2022-04-04 20:45:40
Message-ID: CAKFQuwajw6Za+j8tj9Kzz1NR+wUw=tO3G=8xLkGD79Mh6VuNMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 3, 2022 at 9:16 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

>
> Please take a look!
>
>
I didn't take the time to fixup all the various odd typos in the general
code comments; none of them reduced comprehension appreciably. I may do so
when/if I do another pass.

I did skim over the entire patch set and, FWIW, found it to be quite
understandable. I don't have the experience to comment on the lower-level
details like locking and such but the medium picture stuff makes sense to
me both as a user and a developer. I did leave a couple of comments about
parts that at least piqued my interest (reset single stats) or seemed like
an undesirable restriction that was under addressed (before server shutdown
called exactly once).

I agree with Thomas's observation regarding PGSTAT_KIND_LAST. I also think
that leaving it starting at 1 makes sense - maybe just fix the name and
comment to better reflect its actual usage in core.

I concur also with changing usages of " / " to ", or"

My first encounter with pg_stat_exists_stat() didn't draw my attention as
being problematic so I'd say we just stick with it. As a SQL user reading:
WHERE exists (...) is somewhat natural; using "have" or back-to-back
stat_stat is less appealing.

I would suggest we do away with stats_fetch_consistency "snapshot" mode and
instead add a function that can be called that would accomplish the same
thing but in "cache" mode. Future iterations of that function could accept
patterns, allowing for something between "one" and "everything".

I'm also not an immediate fan of "fetch_consistency"; with the function
suggestion it is basically "cache" and "no-cache" so maybe:
stats_use_transaction_cache ? (haven't thought hard or long on this one...)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 22d0a1e491..e889c11d9e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2123,7 +2123,7 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss
11:34 0:00 postgres: ser
</row>
<row>
<entry><literal>PgStatsData</literal></entry>
- <entry>Waiting fo shared memory stats data access</entry>
+ <entry>Waiting for shared memory stats data access</entry>
</row>
<row>
<entry><literal>SerializableXactHash</literal></entry>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2689d0962c..bc7bdf8064 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4469,7 +4469,7 @@ PostgresMain(const char *dbname, const char *username)

/*
* (4) turn off the idle-in-transaction, idle-session and
- * idle-state-update timeouts if active. We do this before step (5) so
+ * idle-stats-update timeouts if active. We do this before step (5) so
* that any last-moment timeout is certain to be detected in step (5).
*
* At most one of these timeouts will be active, so there's no need to
diff --git a/src/backend/utils/activity/pgstat.c
b/src/backend/utils/activity/pgstat.c
index dbd55a065d..370638b33b 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -5,7 +5,7 @@
* Provides the infrastructure to collect and access cumulative statistics,
* e.g. per-table access statistics, of all backends in shared memory.
*
- * Most statistics updates are first first accumulated locally in each
process
+ * Most statistics updates are first accumulated locally in each process
* as pending entries, then later flushed to shared memory (just after
commit,
* or by idle-timeout).
*
@@ -371,7 +371,9 @@ pgstat_discard_stats(void)
/*
* pgstat_before_server_shutdown() needs to be called by exactly one
process
* during regular server shutdowns. Otherwise all stats will be lost.
- *
+ * XXX: What bad things happen if this is invoked by more than one process?
+ * I'd presume stats are not actually lost in that case. Can we just
'no-op'
+ * subsequent calls and say "at least once at shutdown, as late as
possible"
* We currently only write out stats for proc_exit(0). We might want to
change
* that at some point... But right now pgstat_discard_stats() would be
called
* during the start after a disorderly shutdown, anyway.
@@ -654,6 +656,14 @@ pgstat_reset_single_counter(PgStat_Kind kind, Oid
objoid)

Assert(!pgstat_kind_info_for(kind)->fixed_amount);

+ /*
+ * More of a conceptual observation here - the fact that something is
fixed does not imply
+ * that it is not fixed at a value greater than zero and thus could have
single subentries
+ * that could be addressed.
+ * I also am unsure, off the top of my head, whether both replication
slots and subscriptions,
+ * which are fixed, can be reset singly (today, and/or whether this patch
enables that capability)
+ */
+
/* Set the reset timestamp for the whole database */
pgstat_reset_database_timestamp(MyDatabaseId, ts);

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-04-04 20:47:04 Re: New Object Access Type hooks
Previous Message Andres Freund 2022-04-04 20:41:31 Re: Mingw task for Cirrus CI