autovacuum launcher crash: assert in pgstat_count_io_op (IOOP_EXTEND on pg_database's VM)

From: Ewan Young <kdbase(dot)hack(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: autovacuum launcher crash: assert in pgstat_count_io_op (IOOP_EXTEND on pg_database's VM)
Date: 2026-05-31 04:36:45
Message-ID: CAON2xHNOyaN9MCZohhD_NL6as3QVhGA0SOn2Hyi9w6+Y-_1bFA@mail.gmail.com
Lists: pgsql-hackers

Hi hackers,

I was stress-testing master (commit e2b35735b00, assertions enabled) with a
workload that does a lot of DDL/DML, including creating and dropping
databases in a tight loop, and the autovacuum launcher kept crashing on me --
every 15-40 minutes or so once it was under load:

TRAP: failed Assert("pgstat_tracks_io_op(MyBackendType, io_object, io_context, io_op)"), File: "pgstat_io.c", Line: 74
LOG: autovacuum launcher process (PID ...) was terminated by signal 6: Aborted

The postmaster recovers fine, but it just starts another launcher that hits
the exact same assert, so it never really gets out of the loop.

The short version: the launcher is in get_database_list(), doing its seqscan
of pg_database, and on-access pruning kicks in during the scan. Since
b46e1e54d07 ("Allow on-access pruning to set pages all-visible"),
heap_page_prune_opt() pins the visibility map unconditionally once it
decides to prune -- before it ever checks rel_read_only. visibilitymap_pin()
isn't read-only though: if the VM page isn't there yet it extends the fork,
and pg_database has no VM fork, so we end up doing an actual relation extend
(IOOP_EXTEND) from the launcher. pgstat_tracks_io_op() says the launcher
must never do an EXTEND, hence the assertion.

What surprised me is that the launcher's catalog scan isn't even flagged
read-only (table_beginscan_catalog doesn't set SO_HINT_REL_READ_ONLY),
so it never actually intends to set the VM -- it just pins/extends it
anyway.
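
The ordering issue can be sketched with another toy model, this time contrasting "pin first, check the hint later" with "check the hint first". This is not a proposed patch: ToyScan and both functions are hypothetical names, the rel_read_only field merely stands in for the SO_HINT_REL_READ_ONLY hint, and whether a hint-first order is actually right for heap_page_prune_opt() is something for the thread to decide.

```c
#include <stdbool.h>

/* Hypothetical scan state for illustration only. */
typedef struct ToyScan {
    bool rel_read_only;   /* stands in for the read-only scan hint */
    int  vm_pins;         /* number of times the VM was pinned */
} ToyScan;

/*
 * Order described in the report: once pruning is decided, pin the VM
 * unconditionally and only afterwards look at the hint.
 * Returns whether the scan would be allowed to set the VM.
 */
static bool toy_prune_pin_first(ToyScan *scan, bool page_prunable)
{
    if (!page_prunable)
        return false;
    scan->vm_pins++;              /* pin happens regardless of the hint */
    return scan->rel_read_only;
}

/*
 * Alternative order: consult the hint before touching the VM, so a scan
 * that can never set the VM never pins (or extends) it.
 */
static bool toy_prune_hint_first(ToyScan *scan, bool page_prunable)
{
    if (!page_prunable)
        return false;
    if (!scan->rel_read_only)
        return false;
    scan->vm_pins++;
    return true;
}
```

With rel_read_only = false, as in the launcher's catalog scan, the pin-first variant still increments vm_pins while the hint-first variant leaves it at zero.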

Here are the relevant frames:
#3 ExceptionalCondition ("pgstat_tracks_io_op(...)", "pgstat_io.c", 74)
    at assert.c:65
#4 pgstat_count_io_op (io_object=IOOBJECT_RELATION,
    io_context=IOCONTEXT_NORMAL, io_op=IOOP_EXTEND, cnt=1, bytes=8192)
    at pgstat_io.c:74
#5 pgstat_count_io_op_time (...) at pgstat_io.c:160
at bufmgr.c:3030
#7 ExtendBufferedRelCommon (... fork=VISIBILITYMAP_FORKNUM ...)
    at bufmgr.c:2774
#8 ExtendBufferedRelTo (... fork=VISIBILITYMAP_FORKNUM, extend_to=1 ...)
    at bufmgr.c:1099
#9 vm_extend (vm_nblocks=1, ...) at visibilitymap.c:614
#10 vm_readbuf (blkno=0, extend=true) at visibilitymap.c:572
#11 visibilitymap_pin (...) at visibilitymap.c:216
#12 heap_page_prune_opt (..., rel_read_only=...) at pruneheap.c:339
#13 heap_prepare_pagescan (...) at heapam.c:638
#14 heapgettup_pagemode (... ForwardScanDirection ...) at heapam.c:1113
#15 heap_getnext (...) at heapam.c:1454
#16 get_database_list () at autovacuum.c:1856
#17 do_start_worker () at autovacuum.c:1172
#19 launch_worker (...) at autovacuum.c:1355
#20 AutoVacLauncherMain (...) at autovacuum.c:780
#21 postmaster_child_launch (child_type=B_AUTOVAC_LAUNCHER, ...)
    at launch_backend.c:268
#22 StartChildProcess (type=B_AUTOVAC_LAUNCHER) at postmaster.c:4030
#23 LaunchMissingBackgroundProcesses () at postmaster.c:3375
#24 ServerLoop () at postmaster.c:1743
#25 PostmasterMain (...) at postmaster.c:1415
#26 main (...) at main.c:231

I haven't been able to boil this down to a clean standalone repro yet -- it
seems to need the launcher to hit get_database_list() at the moment a
pg_database page is prunable and the VM fork still has to grow -- but the
path looks pretty clear from the stack.

Regards,
Ewan
