Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>
Subject: Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?
Date: 2021-12-09 01:40:37
Message-ID: CAH2-Wzm22DiSi7NFJvt3E81PjgqzdLAftoRCKEGQsu_x0wnOug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 30, 2021 at 5:09 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Attached draft patch attempts to explain things in this area within
> the nbtree README. There is a much shorter comment about it within
> vacuumlazy.c. I am concerned about GiST index-only scans themselves,
> of course, but I discovered this issue when thinking carefully about
> the concurrency rules for VACUUM -- I think it's valuable to formalize
> and justify the general rules that index access methods must follow.

I pushed a commit that described how this works for nbtree, in the README file.

I think that there might be an even more subtle race condition in
nbtree itself, though, during recovery. We no longer do a "pin scan"
during recovery these days (see commits 9f83468b, 3e4b7d87, and
687f2cd7 for full information). I think that it might be necessary to
do that, just for the benefit of index-only scans -- if it's necessary
during original execution, then why not during recovery?

The work to remove "pin scans" was justified by pointing out that we
no longer use various kinds of snapshots during recovery, but it said
nothing about index-only scans, which need the TID recycling interlock
(i.e. need to hold onto a leaf page while accessing the heap in sync)
even with an MVCC snapshot. It's easy to imagine how it might have
been missed: nobody ever documented the general issue with index-only
scans, until now. Commit 2ed5b87f recognized they were unsafe for the
optimization that it added (to avoid blocking VACUUM), but never
explained why they were unsafe.

Going back to doing pin scans during recovery seems deeply
unappealing, especially to fix a totally narrow race condition.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message houzj.fnst@fujitsu.com 2021-12-09 01:59:02 RE: row filtering for logical replication
Previous Message Andres Freund 2021-12-09 01:22:27 Re: port conflicts when running tests concurrently on windows.