More problems with VacuumPageHit style global variables

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: More problems with VacuumPageHit style global variables
Date: 2022-04-20 22:03:09
Message-ID: CAH2-WznJaa2kTCBtd83+rm7Bqm6hcuYg1Lq8VewkmzJR5jJ0jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

My recent bugfix commit d3609dd2 addressed an issue with VACUUM
VERBOSE. It would aggregate buffers hit/missed/dirtied counts
incorrectly (by double counting), though only when there are multiple
heap rels processed by the same VACUUM command. It failed to account
for the fact that the VacuumPageHit, VacuumPageMiss, and
VacuumPageDirty global variables were really only designed to work in
autovacuum (see 2011 commit 9d3b5024).

I just realized that there is one remaining problem: parallel VACUUM
doesn't care about these global variables, so there will still be
discrepancies there. I can't really blame that on parallel VACUUM,
though, because vacuumparallel.c at least copies buffer usage counters
from parallel workers (stored in its PARALLEL_VACUUM_KEY_BUFFER_USAGE
space). I wonder why we still have these seemingly redundant global
variables, which are maintained by bufmgr.c (alongside the
pgBufferUsage stuff). It looks like recent commit 5dc0418fab
("Prefetch data referenced by the WAL, take II") taught bufmgr.c to
instrument all buffer accesses. So it looks like we just don't need
VacuumPageHit and friends anymore.

Wouldn't it be better if every VACUUM used the same generic approach,
using pgBufferUsage? As a general rule code that only runs in
autovacuum is a recipe for bugs. It looks like VacuumPageHit is
maintained based on different rules to pgBufferUsage.shared_blks_hit
in bufmgr.c (just as an example), which seems like a bad sign.
Besides, the pgBufferUsage counters have more information, which seems
like it might be useful to the lazyvacuum.c instrumentation.

One question for the author of the WAL prefetch patch, Thomas (CC'd):
It's not 100% clear what the expectation is with pgBufferUsage when
track_io_timing is off, so are fields like
pgBufferUsage.shared_blks_hit (i.e. those that don't have a
time/duration component) officially okay to rely on across the board?
It looks like they are okay to rely on (even when track_io_timing is
off), but it would be nice to put that on a formal footing, if it
isn't already.

--
Peter Geoghegan

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2022-04-20 23:03:25 Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)
Previous Message Tom Lane 2022-04-20 21:43:09 Re: Query generates infinite loop