Re: shared memory stats: high level design decisions: consistency, dropping

From: Andres Freund <andres(at)anarazel(dot)de>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: shared memory stats: high level design decisions: consistency, dropping
Date: 2021-04-05 09:38:32
Message-ID: 20210405093832.okdec2ktys2lc4pl@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I did end up implementing the configurable fetch consistency. Seems to
work well, and it's not that much code after a bunch of other
cleanups. See below the quoted part at the bottom.

I just posted the latest iteration of that code at
https://www.postgresql.org/message-id/20210405092914.mmxqe7j56lsjfsej%40alap3.anarazel.de

Copying from that mail:

I've spent most of the last 2 1/2 weeks on this now. Unfortunately I think
that, while it has gotten a lot closer, it's still about a week's worth of
work away from being committable.

My main concerns are:

- Test Coverage:

I've added a fair bit of tests, but it's still pretty bad. There were a lot
of easy-to-hit bugs in earlier versions that nevertheless passed the test
just fine.

Due to the addition of pg_stat_force_next_flush(), and that there's no need
to wait for the stats collector to write out files, it's now a lot more
realistic to have proper testing of a lot of the pgstat.c code.

- Architectural Review

I rejiggered the patchset pretty significantly, and I think it needs more
review than I see as realistic in the next two days. In particular I don't
think

- Performance Testing

I did a small amount, but given that this touches just about every query
etc, I think that's not enough. My changes unfortunately are substantial
enough to invalidate Horiguchi-san's earlier tests.

- Currently there's a corner case in which function (but not table!) stats
for a dropped function may not be removed. That possibly is not too bad,

- Too many FIXMEs still open

It is quite disappointing to not have the patch go into v14 :(. But I just
don't quite see the path right now. But maybe I am just too tired right now,
and it'll look better tomorrow (err today, in a few hours).

One aspect making this particularly annoying is that there's a number of
stats additions in v14 that'd be easier to make robust with the shared
memory based approach. But I don't think I can get it a committable
shape in 2 days. Nor is it clear to that that it'd be a good idea to
commit it, even if I could just about make it, given that pretty much
everything involves pgstat somewhere.

On 2021-03-25 23:52:58 +0100, Magnus Hagander wrote:
> > Yea, I was thinking similar. We could e.g. have
> > stats_snapshot_consistency, an enum of 'none', 'query', 'snapshot'.
> >
> > With 'none' there's no consistency beyond that a single pg_stat_get_*
> > function call will give consistent results.
> >
> > With 'query' we cache each object on first access (i.e. there can be
> > inconsistency between different objects if their accesses are further
> > appart, but not within a stats object).
> >
> > With 'snapshot' we cache all the stats at the first access, for the
> > duration of the transaction.
> >
> >
> > However: I think 'query' is surprisingly hard to implement. There can be
> > multiple ongoing overlapping queries ongoing at the same time. And even
> > if there aren't multiple ongoing queries, there's really nothing great
> > to hook a reset into.
> >
> > So I'm inclined to instead have 'access', which caches individual stats
> > object on first access. But lasts longer than the query.

I went with stats_fetch_consistency = {snapshot, cache, none}, that
seems a bit more obvious than cache. I haven't yet changed [auto]vacuum
so it uses 'none' unconditionally - but that shouldn't be hard.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiro Ikeda 2021-04-05 10:20:20 Re: Why reset pgstat during recovery
Previous Message Andres Freund 2021-04-05 09:29:14 Re: shared-memory based stats collector