Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date: 2022-03-31 04:20:38
Message-ID: 20220331042038.udnydrj4x45i4khb@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-03-30 21:04:07 -0700, Andres Freund wrote:
> On 2022-03-30 20:35:25 -0700, Peter Geoghegan wrote:
> > On Wed, Mar 30, 2022 at 8:28 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > I triggered twice now, but it took a while longer the second time.
> >
> > Great.
> >
> > I wonder if you can get an RR recording...
>
> Started it, but looks like it's too slow.
>
> (gdb) p MyProcPid
> $1 = 2172500
>
> (gdb) p vacrel->NewRelfrozenXid
> $3 = 717
> (gdb) p vacrel->relfrozenxid
> $4 = 717
> (gdb) p OldestXmin
> $5 = 5112
> (gdb) p aggressive
> $6 = false

I added a bunch of debug elogs to see what sets *frozenxid_updated to true.

(gdb) p *vacrel
$1 = {rel = 0x7fe24f3e0148, indrels = 0x7fe255c17ef8, nindexes = 2, aggressive = false, skipwithvm = true, failsafe_active = false,
consider_bypass_optimization = true, do_index_vacuuming = true, do_index_cleanup = true, do_rel_truncate = true, bstrategy = 0x7fe255bb0e28, pvs = 0x0,
relfrozenxid = 717, relminmxid = 6, old_live_tuples = 42, OldestXmin = 20751, vistest = 0x7fe255058970 <GlobalVisSharedRels>, FreezeLimit = 4244988047,
MultiXactCutoff = 4289967302, NewRelfrozenXid = 717, NewRelminMxid = 6, skippedallvis = false, relnamespace = 0x7fe255c17bf8 "pg_catalog",
relname = 0x7fe255c17cb8 "pg_database", indname = 0x0, blkno = 4294967295, offnum = 0, phase = VACUUM_ERRCB_PHASE_SCAN_HEAP, verbose = false,
dead_items = 0x7fe255c131d0, rel_pages = 8, scanned_pages = 8, removed_pages = 0, lpdead_item_pages = 0, missed_dead_pages = 0, nonempty_pages = 8,
new_rel_tuples = 124, new_live_tuples = 42, indstats = 0x7fe255c18320, num_index_scans = 0, tuples_deleted = 0, lpdead_items = 0, live_tuples = 42,
recently_dead_tuples = 82, missed_dead_tuples = 0}

But the debug elog reports that

relfrozenxid updated 714 -> 717
relminmxid updated 1 -> 6
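
To be concrete about what I mean by "debug elogs": it's just an elog next to the spot in vac_update_relstats() (src/backend/commands/vacuum.c) that sets *frozenxid_updated, roughly along these lines. Sketch only, not my actual local diff:

	/*
	 * Illustrative sketch: log old/new relfrozenxid at the point where
	 * vac_update_relstats() decides the value advanced.  Simplified
	 * condition, for illustration only.
	 */
	if (TransactionIdIsNormal(frozenxid) &&
		pgcform->relfrozenxid != frozenxid &&
		TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid))
	{
		elog(LOG, "relfrozenxid updated %u -> %u",
			 pgcform->relfrozenxid, frozenxid);
		pgcform->relfrozenxid = frozenxid;
		dirty = true;
		if (frozenxid_updated)
			*frozenxid_updated = true;
	}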

The problem is that the crashing backend reads the relfrozenxid/relminmxid
from the shared relcache init file written by another backend:

2022-03-30 21:10:47.626 PDT [2625038][autovacuum worker][6/433:0][] LOG: automatic vacuum of table "contrib_regression_postgres_fdw.pg_catalog.pg_database": index scans: 1
pages: 0 removed, 8 remain, 8 scanned (100.00% of total)
tuples: 4 removed, 114 remain, 72 are dead but not yet removable
removable cutoff: 20751, older by 596 xids when operation ended
new relfrozenxid: 717, which is 3 xids ahead of previous value
new relminmxid: 6, which is 5 mxids ahead of previous value
index scan needed: 3 pages from table (37.50% of total) had 8 dead item identifiers removed
index "pg_database_datname_index": pages: 2 in total, 0 newly deleted, 0 currently deleted, 0 reusable
index "pg_database_oid_index": pages: 6 in total, 0 newly deleted, 2 currently deleted, 2 reusable
I/O timings: read: 0.050 ms, write: 0.102 ms
avg read rate: 209.860 MB/s, avg write rate: 76.313 MB/s
buffer usage: 42 hits, 22 misses, 8 dirtied
WAL usage: 13 records, 5 full page images, 33950 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
...
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][:0][] DEBUG: InitPostgres
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/0:0][] DEBUG: my backend ID is 6
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/0:0][] LOG: reading shared init file
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/443:0][] DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/sub>
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/443:0][] LOG: reading non-shared init file

This is basically the inverse of a54e1f15 - we read a *newer* horizon. That's
normally fairly harmless - I think.

Perhaps we should just fetch the horizons from the "local" catalog for shared
rels?
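
I.e. something roughly like the following when setting up the cutoffs in
heap_vacuum_rel(), instead of trusting the relcache entry's rd_rel values.
Just a sketch of the idea, not a worked-out patch; placement and details
would need more thought:

	/*
	 * Sketch: read relfrozenxid/relminmxid from our own database's
	 * pg_class row for the relation, rather than from the relcache entry,
	 * which may have been loaded from a shared relcache init file written
	 * by another backend with a newer horizon.
	 */
	HeapTuple	ctup;
	Form_pg_class pgcform;

	ctup = SearchSysCache1(RELOID, ObjectIdGetDatum(RelationGetRelid(rel)));
	if (!HeapTupleIsValid(ctup))
		elog(ERROR, "pg_class entry for relid %u vanished during vacuum",
			 RelationGetRelid(rel));
	pgcform = (Form_pg_class) GETSTRUCT(ctup);

	vacrel->relfrozenxid = pgcform->relfrozenxid;
	vacrel->relminmxid = pgcform->relminmxid;

	ReleaseSysCache(ctup);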

Greetings,

Andres Freund
