Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-12-11 22:51:35
Message-ID: CAH2-WzkzuuY0V9xE10LJS1bBH=GMuGrV7BMf3q3wQyt-ABFVrQ@mail.gmail.com
Lists: pgsql-bugs

On Sat, Dec 11, 2021 at 2:03 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I have no objection to delaying the lazy_scan_heap_limits() stuff
> until right before lazy_scan_heap() is called. However, I do think
> that we should always know certain basic "immutable" facts about a
> VACUUM at the point that we call lazy_scan_heap(), which is not the
> case with this patch.
>
> Honestly, I'm surprised that you see much value in delaying the
> lazy_scan_heap_limits() stuff until the very last microsecond. How
> many microseconds could we possibly delay it by?

Have you thought about the implications for the ongoing work to set
pg_class.relfrozenxid to the oldest observed XID in the table, instead
of just using FreezeLimit naively (which I've prototyped but haven't
posted)?

What if the target heap relation gets extended after we've established
nblocks/rel_pages for the vacuumlazy.c operation (by calling
RelationGetNumberOfBlocks()), but before we get to the
lazy_scan_heap_limits() stuff for the same operation? What if there
are a small number of heap pages at the end of the relation that we
won't get to at all in the ongoing VACUUM? They could have heap tuples
whose header XIDs are from just before our OldestXmin cutoff. I
believe it follows that we cannot afford to miss them (at least not in
an aggressive VACUUM, and maybe not ever with my patch).
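
To make the ordering concern concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL code: `ToyHeap`, `toy_vacuum`, and the `extend_between` parameter are hypothetical names, XIDs are plain integers with no wraparound, and freezing is ignored entirely. It fixes rel_pages first, lets the relation be extended, and only then establishes the cutoff, which is the sequence described above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHeap:
    # Each page is a list of tuple-header XIDs (plain ints, no wraparound).
    pages: list = field(default_factory=list)

    def nblocks(self):
        return len(self.pages)


def toy_vacuum(heap, cutoff, extend_between=None):
    """Hypothetical model of the ordering discussed above.

    1. rel_pages is fixed first.
    2. A concurrent backend may extend the relation (extend_between).
    3. Only then is the OldestXmin-style cutoff established.

    Returns the XID this toy would store as relfrozenxid (the oldest XID
    seen on scanned pages, capped at the cutoff) and the XIDs on pages
    that were never scanned but are older than that value.
    """
    rel_pages = heap.nblocks()                 # step 1
    if extend_between is not None:
        heap.pages.append(extend_between)      # step 2
    oldest_xmin = cutoff                       # step 3

    scanned = [xid for page in heap.pages[:rel_pages] for xid in page]
    new_relfrozenxid = min(scanned + [oldest_xmin])

    missed = [
        xid
        for page in heap.pages[rel_pages:]
        for xid in page
        if xid < new_relfrozenxid
    ]
    return new_relfrozenxid, missed


if __name__ == "__main__":
    heap = ToyHeap(pages=[[122], [123]])
    # A page holding XID 120 is added after rel_pages was determined,
    # but XID 120 is still older than the cutoff of 125.
    print(toy_vacuum(heap, cutoff=125, extend_between=[120]))  # (122, [120])
```

In this toy run the value chosen for relfrozenxid (122) is newer than an XID (120) sitting on a page the scan never reached, which is the kind of outcome the question above is worried about.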

--
Peter Geoghegan
