Re: btree vacuum and suspended scans can deadlock

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree vacuum and suspended scans can deadlock
Date: 2016-10-13 21:44:29
Message-ID: 21970.1476395069@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Oct 13, 2016 at 6:33 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> If we agree that the above is a problematic case, then some of the options
>> to solve it could be: (a) vacuum should not wait for a cleanup lock and
>> instead just give up and start again, which I think is a bad idea; (b)
>> don't allow taking a lock of higher granularity after the scan is
>> suspended, though I'm not sure that is feasible; (c) document the above
>> danger, which sounds okay on the grounds that nobody has reported the
>> problem till now.

> I don't think any of these sound particularly good.

Note that it's a mistake to imagine that this is specific to indexes;
the same type of deadlock risk exists just when considering heap cleanup.
We've ameliorated the heap case quite a bit by recognizing situations
where it's okay to skip a page and move on, but it's not gone.
Unfortunately indexes don't get to decide that deletion is optional.
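
To make the skip-versus-wait distinction concrete, here is a toy sketch. It is
not PostgreSQL code: Page, try_cleanup_lock, vacuum_heap and vacuum_index are
names invented for illustration, and a "pin" is modeled as a bare counter. It
only shows why a pass that may skip a pinned page never blocks, while a pass
that must delete from every page has to wait on whoever holds the pin.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy stand-in for a buffer page (hypothetical, not a PostgreSQL struct)."""
    pin_count: int = 0   # pins held by other backends, e.g. a suspended scan
    dead_items: int = 0  # entries that vacuum would like to remove

    def try_cleanup_lock(self) -> bool:
        # A cleanup lock is only available when nobody else has the page pinned.
        return self.pin_count == 0


def vacuum_heap(pages):
    """Heap-style pass: cleanup is optional, so pinned pages are skipped."""
    skipped = []
    for i, page in enumerate(pages):
        if page.try_cleanup_lock():
            page.dead_items = 0
        else:
            skipped.append(i)  # leave it for a later vacuum; never block
    return skipped


def vacuum_index(pages):
    """Index-style pass: deletion is mandatory, so a pinned page means waiting.

    Returns the index of the page the pass would block on, or None.
    """
    for i, page in enumerate(pages):
        if not page.try_cleanup_lock():
            # Real code would sleep here until the pin goes away. If the pin
            # holder is itself waiting on something this vacuum holds, neither
            # side can make progress.
            return i
        page.dead_items = 0
    return None
```

With a pinned page in the middle of a three-page relation, vacuum_heap cleans
pages 0 and 2 and reports page 1 as skipped, whereas vacuum_index stops at
page 1 and never reaches page 2.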

I was about to suggest that maybe we didn't need cleanup locks in btree
indexes anymore now that SnapshotNow is gone, but I see that somebody
already altered nbtree/README to say this:

: Therefore, a scan using an MVCC snapshot which has no other confounding
: factors will not hold the pin after the page contents are read. The
: current reasons for exceptions, where a pin is still needed, are if the
: index is not WAL-logged or if the scan is an index-only scan.

This is one of the saddest excuses for documentation I've ever seen,
because it doesn't explain WHY either of those conditions requires a VACUUM
interlock, and certainly it's not immediately obvious why either should.
"git blame" pins the blame for this text on Kevin, so I'm going to throw
it up to him to explain himself.

regards, tom lane
