Re: btree vacuum and suspended scans can deadlock

From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree vacuum and suspended scans can deadlock
Date: 2016-10-14 21:00:23
Message-ID: CACjxUsNtBXe1OfRp=acB+8QFAVWJ=nr55_HMmqQYceCzVGF4tQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 13, 2016 at 4:44 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I was about to suggest that maybe we didn't need cleanup locks in btree
> indexes anymore now that SnapshotNow is gone, but I see that somebody
> already altered nbtree/README to say this:
>
> : Therefore, a scan using an MVCC snapshot which has no other confounding
> : factors will not hold the pin after the page contents are read. The
> : current reasons for exceptions, where a pin is still needed, are if the
> : index is not WAL-logged or if the scan is an index-only scan.
>
> This is one of the saddest excuses for documentation I've ever seen,
> because it doesn't explain WHY either of those conditions require a VACUUM
> interlock, and certainly it's not immediately obvious why they should.
> "git blame" pins the blame for this text on Kevin, so I'm going to throw
> it up to him to explain himself.

Going back to old posts to confirm the reasoning at the time, I
find this:

The reason unlogged tables are an issue is that when a pin is not
held for the index page, TIDs may be reused before we move to the
next page; LP_DEAD hinting (one of the last things done with the
old page before moving to the next page) would not work correctly
in such a case. We work around that by storing the page LSN into
the scan position structure when the page contents are read, and
only doing hinting if that matches the current LSN for the page
when we are ready to do the hinting. That won't work for an index
which is not WAL-logged, since the LSN is not set, so we hold pins
for those.

Visibility information for an index-only scan isn't checked while
the index page READ lock is held, so it appears that some work
is needed to change that before such scans can drop the pins.

Would you like me to add something to that effect into the README
now, or would you prefer to take it from here?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
