| From: | Ewan Young <kdbase(dot)hack(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | autovacuum launcher crash: assert in pgstat_count_io_op (IOOP_EXTEND on pg_database's VM) |
| Date: | 2026-05-31 04:36:45 |
| Message-ID: | CAON2xHNOyaN9MCZohhD_NL6as3QVhGA0SOn2Hyi9w6+Y-_1bFA@mail.gmail.com |
| Lists: | pgsql-hackers |
Hi hackers,
I was stress-testing master (commit e2b35735b00, assertions enabled) with a
workload that does a lot of DDL/DML, including creating and dropping
databases in a tight loop, and the autovacuum launcher kept crashing on me,
roughly every 15-40 minutes once it was under load:
TRAP: failed Assert("pgstat_tracks_io_op(MyBackendType, io_object, io_context, io_op)"), File: "pgstat_io.c", Line: 74
LOG: autovacuum launcher process (PID ...) was terminated by signal 6: Aborted
The postmaster recovers fine, but it just starts another launcher that hits
the exact same assert, so it never really gets out of the loop.
The short version: the launcher is in get_database_list(), doing its seqscan
of pg_database, and on-access pruning kicks in during the scan. Since
b46e1e54d07 ("Allow on-access pruning to set pages all-visible"),
heap_page_prune_opt() pins the visibility map unconditionally once it decides
to prune -- before it ever checks rel_read_only. visibilitymap_pin() isn't
read-only though: if the VM page isn't there yet it extends the fork, and
pg_database has no VM fork at that point, so we end up doing an actual
relation extend (IOOP_EXTEND) from the launcher. pgstat_tracks_io_op() says
the launcher must never do an EXTEND, hence the assertion failure.
What surprised me is that the launcher's catalog scan isn't even flagged
read-only (table_beginscan_catalog doesn't set SO_HINT_REL_READ_ONLY),
so it never actually intends to set the VM -- it just pins/extends it
anyway.
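To make the ordering issue concrete, here is a toy C model of the path
described above. It is not PostgreSQL code: every Toy*/toy_* name is made up
for illustration, toy_tracks_io_op() mirrors only the single rule mentioned
here (the launcher must never extend), and toy_prune_check_first() is just
one hypothetical reordering shown for contrast, not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the backend type and I/O operation enums. */
typedef enum { TOY_BACKEND, TOY_AUTOVAC_LAUNCHER } ToyBackendType;
typedef enum { TOY_IOOP_READ, TOY_IOOP_EXTEND } ToyIOOp;

/* Models only the rule from the report: the launcher must never extend. */
static bool
toy_tracks_io_op(ToyBackendType bktype, ToyIOOp op)
{
    return !(bktype == TOY_AUTOVAC_LAUNCHER && op == TOY_IOOP_EXTEND);
}

typedef struct ToyRel
{
    bool vm_page_exists;    /* does the VM fork already cover the page? */
    int  untracked_extends; /* extends that would have tripped the assert */
} ToyRel;

/* Pinning is not read-only: a missing VM page means extending the fork. */
static void
toy_vm_pin(ToyRel *rel, ToyBackendType bktype)
{
    assert(rel != NULL);
    if (!rel->vm_page_exists)
    {
        if (!toy_tracks_io_op(bktype, TOY_IOOP_EXTEND))
            rel->untracked_extends++;   /* the real code asserts here */
        rel->vm_page_exists = true;
    }
}

/* Ordering as described in the report: pin first, check the flag later. */
static void
toy_prune_pin_first(ToyRel *rel, ToyBackendType bktype, bool rel_read_only)
{
    toy_vm_pin(rel, bktype);
    (void) rel_read_only;   /* only consulted after the pin has happened */
}

/* Hypothetical alternative: pin only when the scan could set the VM. */
static void
toy_prune_check_first(ToyRel *rel, ToyBackendType bktype, bool rel_read_only)
{
    if (rel_read_only)
        toy_vm_pin(rel, bktype);
}
```

In this model, calling toy_prune_pin_first() as TOY_AUTOVAC_LAUNCHER on a
ToyRel with no VM page records one disallowed extend even though
rel_read_only is false, while toy_prune_check_first() with the same inputs
records none and leaves the VM untouched.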
Here are the relevant frames:
#3 ExceptionalCondition ("pgstat_tracks_io_op(...)", "pgstat_io.c", 74)
at assert.c:65
#4 pgstat_count_io_op (io_object=IOOBJECT_RELATION,
io_context=IOCONTEXT_NORMAL, io_op=IOOP_EXTEND, cnt=1, bytes=8192)
at pgstat_io.c:74
#5 pgstat_count_io_op_time (...) at pgstat_io.c:160
at bufmgr.c:3030
#7 ExtendBufferedRelCommon (... fork=VISIBILITYMAP_FORKNUM ...)
at bufmgr.c:2774
#8 ExtendBufferedRelTo (... fork=VISIBILITYMAP_FORKNUM, extend_to=1 ...)
at bufmgr.c:1099
#9 vm_extend (vm_nblocks=1, ...) at visibilitymap.c:614
#10 vm_readbuf (blkno=0, extend=true) at visibilitymap.c:572
#11 visibilitymap_pin (...) at visibilitymap.c:216
#12 heap_page_prune_opt (..., rel_read_only=...) at pruneheap.c:339
#13 heap_prepare_pagescan (...) at heapam.c:638
#14 heapgettup_pagemode (... ForwardScanDirection ...) at heapam.c:1113
#15 heap_getnext (...) at heapam.c:1454
#16 get_database_list () at autovacuum.c:1856
#17 do_start_worker () at autovacuum.c:1172
#19 launch_worker (...) at autovacuum.c:1355
#20 AutoVacLauncherMain (...) at autovacuum.c:780
#21 postmaster_child_launch (child_type=B_AUTOVAC_LAUNCHER, ...)
at launch_backend.c:268
#22 StartChildProcess (type=B_AUTOVAC_LAUNCHER) at postmaster.c:4030
#23 LaunchMissingBackgroundProcesses () at postmaster.c:3375
#24 ServerLoop () at postmaster.c:1743
#25 PostmasterMain (...) at postmaster.c:1415
#26 main (...) at main.c:231
I haven't been able to boil this down to a clean standalone repro yet -- it
seems to need the launcher to hit get_database_list() at the moment a
pg_database page is prunable and the VM fork still has to grow -- but the
path looks pretty clear from the stack.
Regards,
Ewan