Re: shared-memory based stats collector

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Georgios <gkokolatos(at)protonmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: shared-memory based stats collector
Date: 2021-03-13 11:53:30
Message-ID: CABUevEw49ESGX0gf1xqBqJgt76OkiVuZRQD0G1Cy58Nhv=txUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 13, 2021 at 7:20 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-03-11 19:22:57 -0800, Andres Freund wrote:
> > I started changing the patch to address my complaints. I'll try to do
> > it as an incremental patch ontop of your 0004, but it might become too
> > unwieldy. Not planning to touch other patches for now (and would be
> > happy if the first few were committed). I do think we'll have to
> > split 0004 a bit - it's just too large to commit as is, I think.
>
> Far from being done (and the individual commits is just me working, not
> something that is intended to survive), but I thought it might be
> helpful to post my WIP git tree:
> https://github.com/anarazel/postgres/commits/shmstat
>
> I do think that the removal of the the "attach" logic, as well as the
> more explicit naming differentiating between the three different hash
> tables is a pretty clear improvement. It's way too easy to get turned
> around otherwise.
>
>
> I suspect to make all of this workable we'll have to add a few
> preliminary cleanup patches. I'm currently thinking:
>
> 1) There's a bunch of functions in places that don't make sense.
>
>
> /* ------------------------------------------------------------
> * Local support functions follow
> * ------------------------------------------------------------
> */
>
> [internal stuff]
> ...
> void
> pgstat_send_archiver(const char *xlog, bool failed)
> ...
> void
> pgstat_send_bgwriter(void)
> ...
> void
> pgstat_report_wal(void)
> ...
> bool
> pgstat_send_wal(bool force)
> ...
> [internal stuff]
> ...
> void
> pgstat_count_slru_page_zeroed(int slru_idx)
>
> ...
>
> I think it'd make sense to separately clean that up.
>
>
> 2) src/backend/postmaster/pgstat.c currently contains at least two,
> effectively independent, subsystems. Arguably more:
> a) cumulative stats collection infrastructure: sending data to the
> persistent stats file, and reading from it.
> b) "content aware" cumulative statistics function
>
> c) "current" activity infrastructure, around PgBackendStatus (which
> basically boils down to pg_stat_activity et al.)
> d) wait events
> e) command progress stuff
>
> They don't actually have much to do with each other, except being
> related to stats. Even without the shared memory stats, having these
> be in one file makes pretty much no sense. Having them all under one
> common pgstat_* prefix is endlessly confusing too.
>
> I think before making things differently complicated with this patch,
> we need to clean this up, unfortunately. I think we should initially have
> - src/backend/postmaster/pgstat.c, for a), b) above
> - src/backend/utils/activity/backend_status.c for c)
> - src/backend/utils/activity/wait_events.c for d)
> - src/backend/utils/activity/progress.c for e)
>
> Not at all sure about the names, but something roughly like this
> would im make sense.

+1, definitely.

I've been thinking in these lines more than once when poking at
differen patches around that area, but they've never been big enough
to justify the restructuring on their own. Which then of course just
helps accumulate the problem...

If anything, I'd even consider splitting (a) and (b) above out into
separate ones as well. But hey, I see you got to that in your next
paragraph :)

This does seem like a good time to do it, given the size of this
suggested change.

> The next thing to note is that after this whole patchseries, having the
> remaining functionality in src/backend/postmaster/pgstat.c doesn't make
> sense. The things in postmaster/ are related to postmaster
> sub-processes, not random pieces of backend infrastructure. Therefore I
> am thinking that patch 0004 should be changed so it basically adds all
> the changed code to two new files:
> - src/backend/utils/activity/stats.c - for the stats keeping
> infrastructure (shared memory hash table, pending table, etc)
> - src/backend/utils/activity/stats_report.c - for pgstat_report_*,
> pgstat_count_*, pgstat_update_*, flush_*, i.e. everything that knows
> about specific kinds of stats.
>
> The reason for that split is that imo the two pieces of code are largely
> independent. One shouldn't need to understand the way stats are stored
> in any sort of detail to add a new stats field and vice versa.

Agreed as well.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-03-13 13:49:50 Re: pg_amcheck contrib application
Previous Message Julien Rouhaud 2021-03-13 11:49:00 Re: pl/pgsql feature request: shorthand for argument and local variable references