Re: shared-memory based stats collector - v66

From: Andres Freund <andres(at)anarazel(dot)de>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: ibrar(dot)ahmad(at)gmail(dot)com, masao(dot)fujii(at)oss(dot)nttdata(dot)com, gkokolatos(at)protonmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: shared-memory based stats collector - v66
Date: 2022-03-17 07:36:52
Message-ID: 20220317073652.kz3uuh6le6vbfod4@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I've attached a substantially improved version of the shared memory stats
patch.

The biggest changes are:
- chopped the bigger patches into smaller chunks. Most importantly the
"transactional stats creation / drop" part is now its own commit. Also the
tests I added.

- put in a defense against the danger of loosing function stats entries due to
the lack of cache invalidation during function calls. See long comment in
pgstat_init_function_usage()

- split up pgstat_global.c into pgstat_{archiver, bgwriter, checkpoint,
replslot, slru, wal}. While each individually not large, there were enough
of them to make the file confusing. Feels a lot better to work with.

- replication slot stats used the slot "index" in a dangerous way, fixed

- implemented a few omitted features like resetting all subscriptions and
setting reset timestamps (Melanie)

- loads of code and comment polishing

I think the first few patches are basically ready to be applied and are
independently worthwhile:
- 0001-pgstat-run-pgindent-on-pgstat.c-h.patch
- 0002-pgstat-split-relation-database-stats-handling-ou.patch
- 0003-pgstat-split-out-WAL-handling-from-pgstat_-initi.patch
- 0004-pgstat-introduce-pgstat_relation_should_count.patch
- 0005-pgstat-xact-level-cleanups-consolidation.patch

Might not be worth having separately, should probably just be part of
0014:
- 0006-pgstat-wip-pgstat-relation-init-assoc.patch

A pain to maintain, needs mostly a bit of polishing of file headers. Perhaps I
should rename pgstat_checkpoint.c to pgstat_checkpointer.c, fits better with
function names:
- 0007-pgstat-split-different-types-of-stats-into-separ.patch

This is also painful to maintain. Mostly kept separate from 0007 for easier
reviewing:
- 0009-pgstat-reorder-file-pgstat.c-pgstat.h-contents.patch

Everything after isn't yet quite there / depend on patches that aren't yet
there:
- 0010-pgstat-add-pgstat_copy_relation_stats.patch
- 0011-pgstat-remove-superflous-comments-from-pgstat.h.patch
- 0012-pgstat-stats-collector-references-in-comments.patch
- 0013-pgstat-scaffolding-for-transactional-stats-creat.patch
- 0014-pgstat-store-statistics-in-shared-memory.patch
- 0015-pgstat-add-pg_stat_force_next_flush.patch

Notably this patch makes stat.sql a test that can safely be run concurrent
with other tests:
- 0016-pgstat-utilize-pg_stat_force_next_flush-to-simpl.patch

Needs a tap test as well, but already covers a lot of things that aren't
covered today. Unfortunately it can't really be applied before because it's
too hard to write / slow to run without 0015
- 0017-pgstat-extend-pgstat-test-coverage.patch

I don't yet know what we should do with other users of
PG_STAT_TMP_DIR. There's no need for it for pgstat.c et al anymore. Not sure
that pg_stat_statement is enough of a reason to keep the stats_temp_directory
GUC around?
- 0019-pgstat-wip-remove-stats_temp_directory.patch

Right now we reset stats for replicas, even if we start from a shutdown
checkpoint. That seems pretty unnecessary with this patch:
- 0021-pgstat-wip-only-reset-pgstat-data-after-crash-re.patch

Starting to feel more optimistic about this! There's loads more to do, but now
the TODOs just seem to require elbow grease, rather than deep thinking.

The biggest todos are:
- Address all the remaining AFIXMEs and XXXs

- add longer explanation of architecture to pgstat.c (or a README)

- Further improve our stats test coverage - there's a crapton not covered,
despite 0017:
- test WAL replay with stats (stats for dropped tables are removed etc)
- test crash recovery and "invalid stats file" paths
- lot of the pg_stat_ views like bgwriter, pg_stat_database have zero coverage today

- make naming not "a pain in the neck": [1]

- lots of polishing

- revise docs

- run benchmarks - I've done so in the past, but not recently

- perhaps 0014 can be further broken down - it's still uncomfortably large

It's worth noting that the patchset, leaving new tests aside, has a
substantially negative diffstat, even if one includes all the new file
headers... Once a bunch more cleanup is done, I bet it'll improve further.

with new tests:
80 files changed, 9759 insertions(+), 8051 deletions(-)

without tests (mildly inaccurate):
71 files changed, 6991 insertions(+), 7814 deletions(-)

just shared memory stats patch, not all the code movement, new file headers:
49 files changed, 4079 insertions(+), 5472 deletions(-)

Comments and reviews welcome!

I regularly push to https://github.com/anarazel/postgres/tree/shmstat fwiw -
the series is way too big to spam the list all the time.

Greetings,

Andres Freund

[1] https://postgr.es/m/20220303.170412.1542007127371857370.horikyota.ntt%40gmail.com

Attachment Content-Type Size
v66-0001-pgstat-run-pgindent-on-pgstat.c-h.patch.gz application/x-patch-gzip 2.2 KB
v66-0002-pgstat-split-relation-database-stats-handling-ou.patch.gz application/x-patch-gzip 1.9 KB
v66-0003-pgstat-split-out-WAL-handling-from-pgstat_-initi.patch.gz application/x-patch-gzip 1.4 KB
v66-0004-pgstat-introduce-pgstat_relation_should_count.patch.gz application/x-patch-gzip 1.6 KB
v66-0005-pgstat-xact-level-cleanups-consolidation.patch.gz application/x-patch-gzip 1.8 KB
v66-0006-pgstat-wip-pgstat-relation-init-assoc.patch.gz application/x-patch-gzip 2.0 KB
v66-0007-pgstat-split-different-types-of-stats-into-separ.patch.gz application/x-patch-gzip 30.0 KB
v66-0008-pgstat-fix-function-name.patch.gz application/x-patch-gzip 516 bytes
v66-0009-pgstat-reorder-file-pgstat.c-pgstat.h-contents.patch.gz application/x-patch-gzip 11.0 KB
v66-0010-pgstat-add-pgstat_copy_relation_stats.patch.gz application/x-patch-gzip 1.4 KB
v66-0011-pgstat-remove-superflous-comments-from-pgstat.h.patch.gz application/x-patch-gzip 1.2 KB
v66-0012-pgstat-stats-collector-references-in-comments.patch.gz application/x-patch-gzip 8.3 KB
v66-0013-pgstat-scaffolding-for-transactional-stats-creat.patch.gz application/x-patch-gzip 8.0 KB
v66-0014-pgstat-store-statistics-in-shared-memory.patch.gz application/x-patch-gzip 88.5 KB
v66-0015-pgstat-add-pg_stat_force_next_flush.patch.gz application/x-patch-gzip 1.6 KB
v66-0016-pgstat-utilize-pg_stat_force_next_flush-to-simpl.patch.gz application/x-patch-gzip 4.4 KB
v66-0017-pgstat-extend-pgstat-test-coverage.patch.gz application/x-patch-gzip 5.6 KB
v66-0018-pgstat-move-pgstat.c-to-utils-activity.patch.gz application/x-patch-gzip 916 bytes
v66-0019-pgstat-wip-remove-stats_temp_directory.patch.gz application/x-patch-gzip 3.0 KB
v66-0020-pgstat-rename-STATS_COLLECTOR-GUC-group-to-STATS.patch.gz application/x-patch-gzip 1.4 KB
v66-0021-pgstat-wip-only-reset-pgstat-data-after-crash-re.patch.gz application/x-patch-gzip 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-03-17 07:39:52 Re: pg_tablespace_location() failure with allow_in_place_tablespaces
Previous Message Michael Paquier 2022-03-17 07:22:14 Re: Out-of-tree certificate interferes ssltest