Re: shared-memory based stats collector - v68

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Georgios <gkokolatos(at)protonmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector - v68
Date: 2022-04-04 13:16:04
Message-ID: CA+hUKGL9hY_VY=+oUK+Gc1iSRx-Ls5qeYJ6q=dQVZnT3R63Taw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 4, 2022 at 4:16 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Please take a look!

A few superficial comments:

> [PATCH v68 01/31] pgstat: consistent function header formatting.
> [PATCH v68 02/31] pgstat: remove some superflous comments from pgstat.h.

+1

> [PATCH v68 03/31] dshash: revise sequential scan support.

Logic looks good. That is,
lock-0-and-ensure_valid_bucket_pointers()-only-once makes sense. Just
some comment trivia:

+ * dshash_seq_term needs to be called when a scan finished. The caller may
+ * delete returned elements midst of a scan by using dshash_delete_current()
+ * if exclusive = true.

s/scan finished/scan is finished/
s/midst of/during/ (or /in the middle of/, ...)

> [PATCH v68 04/31] dsm: allow use in single user mode.

LGTM.

+ Assert(IsUnderPostmaster || !IsPostmasterEnvironment);

(Not this patch's fault, but I wish we had a more explicit way to say "am
single user".)

> [PATCH v68 05/31] pgstat: stats collector references in comments

LGTM. I could think of some alternative suggested names for this subsystem,
but don't think it would be helpful at this juncture so I will refrain :-)

> [PATCH v68 06/31] pgstat: add pgstat_copy_relation_stats().
> [PATCH v68 07/31] pgstat: move transactional code into pgstat_xact.c.

LGTM.

> [PATCH v68 08/31] pgstat: introduce PgStat_Kind enum.

+#define PGSTAT_KIND_FIRST PGSTAT_KIND_DATABASE
+#define PGSTAT_KIND_LAST PGSTAT_KIND_WAL
+#define PGSTAT_NUM_KINDS (PGSTAT_KIND_LAST + 1)

It's a little confusing that PGSTAT_NUM_KINDS isn't really the number of kinds,
because there is no kind 0. For the two users of it... maybe just use
pgstat_kind_infos[] = {...}, and
global_valid[PGSTAT_KIND_LAST + 1]?

> [PATCH v68 10/31] pgstat: scaffolding for transactional stats creation / drop.

+ /*
+ * Dropping the statistics for objects that dropped transactionally itself
+ * needs to be transactional. ...

Hard to parse. How about: "Objects are dropped transactionally, so
related statistics need to be dropped transactionally too."

> [PATCH v68 13/31] pgstat: store statistics in shared memory.

+ * Single-writer stats use the changecount mechanism to achieve low-overhead
+ * writes - they're obviously performance critical than reads. Check the
+ * definition of struct PgBackendStatus for some explanation of the
+ * changecount mechanism.

Missing word "more" after obviously?

+ /*
+ * Whenever the for a dropped stats entry could not be freed (because
+ * backends still have references), this is incremented, causing backends
+ * to run pgstat_gc_entry_refs(), allowing that memory to be reclaimed.
+ */
+ pg_atomic_uint64 gc_count;

Whenever the ...?

Would it be better to call this variable gc_request_count?

+ * Initialize refcount to 1, marking it as valid / not tdroped. The entry

s/tdroped/dropped/

+ * further if a longer lived references is needed.

s/references/reference/

+ /*
+ * There are legitimate cases where the old stats entry might not
+ * yet have been dropped by the time it's reused. The easiest case
+ * are replication slot stats. But oid wraparound can lead to
+ * other cases as well. We just reset the stats to their plain
+ * state.
+ */
+ shheader = pgstat_reinit_entry(kind, shhashent);

This whole comment is repeated in pgstat_reinit_entry and its caller.

+ /*
+ * XXX: Might be worth adding some frobbing of the allocation before
+ * freeing, to make it easier to detect use-after-free style bugs.
+ */
+ dsa_free(pgStatLocal.dsa, pdsa);

FWIW dsa_free() clobbers memory in assert builds, just like pfree().

+static Size
+pgstat_dsa_init_size(void)
+{
+ Size sz;
+
+ /*
+ * The dshash header / initial buckets array needs to fit into "plain"
+ * shared memory, but it's beneficial to not need dsm segments
+ * immediately. A size of 256kB seems works well and is not
+ * disproportional compared to other constant sized shared memory
+ * allocations. NB: To avoid DSMs further, the user can configure
+ * min_dynamic_shared_memory.
+ */
+ sz = 256 * 1024;

It kinda bothers me that the memory reserved by
min_dynamic_shared_memory might eventually fill up with stats, and not
be available for temporary use by parallel queries (which can benefit
more from fast acquire/release on DSMs, and probably also huge pages,
or maybe not...), and that's hard to diagnose.

+ * (4) turn off the idle-in-transaction, idle-session and
+ * idle-state-update timeouts if active. We do this before step (5) so

s/idle-state-/idle-stats-/

+ /*
+ * Some of the pending stats may have not been flushed due to lock
+ * contention. If we have such pending stats here, let the caller know
+ * the retry interval.
+ */
+ if (partial_flush)
+ {

I think it's better for a comment that is outside the block to say "If
some of the pending...". Or the comment should be inside the blocks.

+static void
+pgstat_build_snapshot(void)
+{
...
+ dshash_seq_init(&hstat, pgStatLocal.shared_hash, false);
+ while ((p = dshash_seq_next(&hstat)) != NULL)
+ {
...
+ entry->data = MemoryContextAlloc(pgStatLocal.snapshot.context,
...
+ }
+ dshash_seq_term(&hstat);

Doesn't allocation failure leave the shared hash table locked?

> PATCH v68 16/31] pgstat: add pg_stat_exists_stat() for easier testing.

pg_stat_exists_stat() is a weird name, ... would it be better as
pg_stat_object_exists()?

> [PATCH v68 28/31] pgstat: update docs.

+ Determines the behaviour when cumulative statistics are accessed

AFAIK our manual is written in en_US, so s/behaviour/behavior/.

+ memory. When set to <literal>cache</literal>, the first access to
+ statistics for an object caches those statistics until the end of the
+ transaction / until <function>pg_stat_clear_snapshot()</function> is

s|/|unless|

+ <literal>none</literal> is most suitable for monitoring solutions. If

I'd change "solutions" to "tools" or maybe "systems".

+ When using the accumulated statistics views and functions to
monitor collected data, it is important

Did you intend to write "accumulated" instead of "cumulative" here?

+ You can invoke <function>pg_stat_clear_snapshot</function>() to discard the
+ current transaction's statistics snapshot / cache (if any). The next use

I'd change s|/ cache|or cached values|. I think "/" like this is an informal
thing.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mariam Fahmy 2022-04-04 13:16:24 GSoC: pgmoneta, storage API
Previous Message Peter Eisentraut 2022-04-04 13:08:12 Re: pg_rewind copies