Re: shared-memory based stats collector

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, gkokolatos(at)protonmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: shared-memory based stats collector
Date: 2021-03-13 06:20:40
Message-ID: 20210313062040.sbw7yzvgftu6za5t@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-03-11 19:22:57 -0800, Andres Freund wrote:
> I started changing the patch to address my complaints. I'll try to do
> it as an incremental patch ontop of your 0004, but it might become too
> unwieldy. Not planning to touch other patches for now (and would be
> happy if the first few were committed). I do think we'll have to
> split 0004 a bit - it's just too large to commit as is, I think.

Far from being done (and the individual commits is just me working, not
something that is intended to survive), but I thought it might be
helpful to post my WIP git tree:
https://github.com/anarazel/postgres/commits/shmstat

I do think that the removal of the the "attach" logic, as well as the
more explicit naming differentiating between the three different hash
tables is a pretty clear improvement. It's way too easy to get turned
around otherwise.

I suspect to make all of this workable we'll have to add a few
preliminary cleanup patches. I'm currently thinking:

1) There's a bunch of functions in places that don't make sense.

/* ------------------------------------------------------------
* Local support functions follow
* ------------------------------------------------------------
*/

[internal stuff]
...
void
pgstat_send_archiver(const char *xlog, bool failed)
...
void
pgstat_send_bgwriter(void)
...
void
pgstat_report_wal(void)
...
bool
pgstat_send_wal(bool force)
...
[internal stuff]
...
void
pgstat_count_slru_page_zeroed(int slru_idx)

...

I think it'd make sense to separately clean that up.

2) src/backend/postmaster/pgstat.c currently contains at least two,
effectively independent, subsystems. Arguably more:
a) cumulative stats collection infrastructure: sending data to the
persistent stats file, and reading from it.
b) "content aware" cumulative statistics function

c) "current" activity infrastructure, around PgBackendStatus (which
basically boils down to pg_stat_activity et al.)
d) wait events
e) command progress stuff

They don't actually have much to do with each other, except being
related to stats. Even without the shared memory stats, having these
be in one file makes pretty much no sense. Having them all under one
common pgstat_* prefix is endlessly confusing too.

I think before making things differently complicated with this patch,
we need to clean this up, unfortunately. I think we should initially have
- src/backend/postmaster/pgstat.c, for a), b) above
- src/backend/utils/activity/backend_status.c for c)
- src/backend/utils/activity/wait_events.c for d)
- src/backend/utils/activity/progress.c for e)

Not at all sure about the names, but something roughly like this
would im make sense.

The next thing to note is that after this whole patchseries, having the
remaining functionality in src/backend/postmaster/pgstat.c doesn't make
sense. The things in postmaster/ are related to postmaster
sub-processes, not random pieces of backend infrastructure. Therefore I
am thinking that patch 0004 should be changed so it basically adds all
the changed code to two new files:
- src/backend/utils/activity/stats.c - for the stats keeping
infrastructure (shared memory hash table, pending table, etc)
- src/backend/utils/activity/stats_report.c - for pgstat_report_*,
pgstat_count_*, pgstat_update_*, flush_*, i.e. everything that knows
about specific kinds of stats.

The reason for that split is that imo the two pieces of code are largely
independent. One shouldn't need to understand the way stats are stored
in any sort of detail to add a new stats field and vice versa.

Horiguchi-san, is there a chance you could add a few tests (on master)
that test/document the way stats are kept across "normal" restarts, and
thrown away after crash restarts/immediate restarts and also thrown away
graceful streaming rep shutdowns?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-03-13 06:22:54 Re: pg_amcheck contrib application
Previous Message Mark Dilger 2021-03-13 06:18:32 Re: pg_amcheck contrib application