| From: | Ewan Young <kdbase(dot)hack(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, melanieplageman(at)gmail(dot)com |
| Subject: | Re: autovacuum launcher crash: assert in pgstat_count_io_op (IOOP_EXTEND on pg_database's VM) |
| Date: | 2026-06-01 08:41:35 |
| Message-ID: | CAON2xHOda4QZ+gNCwpMJiiEpByDSxqyU-aQYFz1LJYjWrom5Rw@mail.gmail.com |
| Lists: | pgsql-hackers |
That was exactly the right neighborhood, thanks for the quick pointer.
The mechanism, as far as I can tell (I'm new to this code, so corrections
welcome):
The autovacuum launcher scans pg_database in get_database_list() with a
catalog scan (rel_read_only = false). On a full, prunable page,
heap_page_prune_opt() calls visibilitymap_pin(), which extends the VM fork
if it doesn't exist. The launcher isn't allowed to do IOOP_EXTEND, so
pgstat_count_io_op() trips the assertion at pgstat_io.c:74 and it aborts.
Two commits combine to cause it: 4f7ecca84dd added the unconditional
(extending) visibilitymap_pin() in the on-access prune path, and 378a216187a
made INSERT set pd_prune_xid, so on-access pruning now fires on insert-mostly
catalogs like pg_database. That's also why it was hard to reduce: any
regular backend or autovacuum worker that scans pg_database recreates the
fork harmlessly (they may extend), so the launcher only crashes in the brief
window before that happens. I can now reproduce it deterministically; happy
to share the script.
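The mechanism and the narrow window described above can be sketched as a toy model in C. This is not PostgreSQL code: every name below (ToyBackendType, toy_io_op_allowed, toy_prune_would_assert) is hypothetical, and the permission rule is only the one stated in this message (the launcher may not extend; regular backends and autovacuum workers may).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not PostgreSQL's types. */
typedef enum
{
    TOY_BACKEND_REGULAR,
    TOY_BACKEND_AV_WORKER,
    TOY_BACKEND_AV_LAUNCHER
} ToyBackendType;

typedef enum
{
    TOY_IOOP_READ,
    TOY_IOOP_EXTEND
} ToyIOOp;

/*
 * Assumption taken from the message: only the launcher is barred from
 * extending; every backend type in this model may read.
 */
static bool
toy_io_op_allowed(ToyBackendType bt, ToyIOOp op)
{
    if (op == TOY_IOOP_EXTEND)
        return bt != TOY_BACKEND_AV_LAUNCHER;
    return true;
}

/*
 * Models on-access pruning with an unconditional, extending VM pin.
 * Returns true when the backend would trip the I/O-stats assertion;
 * otherwise the fork exists afterwards and false is returned.
 */
static bool
toy_prune_would_assert(ToyBackendType bt, bool *vm_fork_exists)
{
    ToyIOOp     op;

    assert(vm_fork_exists != NULL);

    /* A missing fork has to be extended; an existing one is just read. */
    op = *vm_fork_exists ? TOY_IOOP_READ : TOY_IOOP_EXTEND;

    if (!toy_io_op_allowed(bt, op))
        return true;            /* models the failed assertion */

    *vm_fork_exists = true;     /* an extending pin leaves the fork behind */
    return false;
}
```

In this model the launcher fails only while the fork is missing; once any backend that may extend has run the same path, the launcher's pin becomes a plain read and the problem disappears, which matches why the crash was hard to reduce.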
Patch attached (one file). Only read-only scans actually set the VM, and
those run only in regular backends (which may extend), so the patch extends
the fork only on read-only scans; for other scans it pins an existing VM page
without extending (via visibilitymap_get_status()). Corruption detection is
unaffected; we just leave fork creation to the next VACUUM. Passes
check/installcheck, isolation, and contrib/pg_visibility.
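As a rough illustration of the decision the patch description implies, here is a minimal sketch in C. It is not the attached patch: ToyVM, ToyPinResult and toy_pin_vm_for_prune are hypothetical names, and the visibility map is reduced to a single "fork exists" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a relation's visibility map (hypothetical). */
typedef struct
{
    bool        fork_exists;
} ToyVM;

typedef enum
{
    TOY_PIN_NONE,               /* nothing pinned, fork left absent */
    TOY_PIN_EXISTING,           /* pinned a page that was already there */
    TOY_PIN_EXTENDED            /* fork was created, then pinned */
} ToyPinResult;

/*
 * Extend the fork only for read-only scans; any other scan may pin an
 * existing page but never creates the fork.
 */
static ToyPinResult
toy_pin_vm_for_prune(ToyVM *vm, bool rel_read_only)
{
    assert(vm != NULL);

    if (vm->fork_exists)
        return TOY_PIN_EXISTING;

    if (rel_read_only)
    {
        vm->fork_exists = true;
        return TOY_PIN_EXTENDED;
    }

    /* Non-read-only scan: fork creation is left to a later VACUUM. */
    return TOY_PIN_NONE;
}
```

With rel_read_only = false, as in the launcher's pg_database scan, the sketch never takes the extending branch, so the disallowed extend cannot be reached from that path.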
This is just my own attempt at a fix and I'm not sure it's correct, so
please don't hesitate to point out anything I've got wrong.
Thanks,
Ewan
On Mon, Jun 1, 2026 at 12:08 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Sun, May 31, 2026 at 12:36:45PM +0800, Ewan Young wrote:
> > I haven't been able to boil this down to a clean standalone repro yet -- it
> > seems to need the launcher to hit get_database_list() at the moment a
> > pg_database page is prunable and the VM fork still has to grow -- but the
> > path looks pretty clear from the stack.
>
> My first suspicion would be something in the area of b46e1e54d078 and
> some of the VM improvements. Melanie?
> --
> Michael
>
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Don-t-extend-the-visibility-map-fork-during-non-r.patch | application/octet-stream | 3.9 KB |