
Re: SSI-related code drift between index_getnext() and heap_hot_search_buffer()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SSI-related code drift between index_getnext() and heap_hot_search_buffer()
Date: 2011-05-14 14:18:42
Message-ID: BANLkTik8KxxjJ1KW-pO+WWBdTEAT+80ArQ@mail.gmail.com
Lists: pgsql-hackers
On Fri, May 13, 2011 at 12:10 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> FWIW, so far what I know is that it will take an example something
> like the one shown here:
>
> http://archives.postgresql.org/pgsql-hackers/2011-02/msg00325.php
>
> with the further requirements that the update in T3 must not be a
> HOT update, T1 would still need to acquire a snapshot before T2
> committed while moving its current select down past the commit of
> T3, and that select would need to be modified so that it would scan
> the visible tuple and then stop (e.g., because of a LIMIT) before
> reaching the tuple which represents the next version of the row.

I think I see another problem here.  Just before returning each tuple,
index_getnext() records in the IndexScanDesc the offset number of the
next tuple in the HOT chain, and the XMAX of the tuple being returned.
On the next call, it will go on to examine that TID, checking, among
other things, whether the XMIN of the tuple at that location matches
the previously stored XMAX.  But no buffer content lock is held
across calls.  So consider a HOT chain A -> B.  After returning A, the
IndexScanDesc will consider that we should next look at B.  Now the
transaction that created B rolls back, and a new transaction updates A,
so we now have A -> C.
(I believe this is possible.)  When the next call to index_getnext()
occurs, it'll look at B and consider that it's reached the end of the
HOT chain - but in reality it has not, because it has never looked at
C.
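
To make the A -> B / A -> C scenario concrete, here is a toy model of it
in Python.  This is only a sketch under assumptions, not PostgreSQL
code: the names HeapTuple, ScanState, chain_getnext and run_scenario are
invented for illustration, tuple visibility is not modeled at all, and
the "page" is just a dict from offset number to tuple.  The only thing
it tries to show is that scan state saved between calls can go stale
when nothing is locked in between.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HeapTuple:
    """Hypothetical stand-in for one tuple version on a heap page."""
    name: str
    xmin: int                       # XID that created this version
    xmax: Optional[int] = None      # XID that updated it, if any
    next_off: Optional[int] = None  # offset of the next chain member


@dataclass
class ScanState:
    """Hypothetical stand-in for what the scan remembers between calls."""
    started: bool = False
    next_off: Optional[int] = None
    prev_xmax: Optional[int] = None


def chain_getnext(page: Dict[int, HeapTuple], scan: ScanState,
                  root_off: int) -> Optional[HeapTuple]:
    """Return the next chain member, or None once the chain looks finished."""
    if not scan.started:
        off: Optional[int] = root_off
        scan.started = True
    else:
        off = scan.next_off          # saved on the previous call
    if off is None:
        return None                  # previous tuple had no successor
    tup = page.get(off)
    if tup is None:
        return None
    if scan.prev_xmax is not None and tup.xmin != scan.prev_xmax:
        return None                  # XMIN/XMAX mismatch: chain is broken
    # Remember where to go next; no lock protects this between calls.
    scan.next_off = tup.next_off
    scan.prev_xmax = tup.xmax
    return tup


def run_scenario() -> List[str]:
    """Walk the chain while a concurrent abort and update change it."""
    page = {
        1: HeapTuple("A", xmin=90, xmax=100, next_off=2),
        2: HeapTuple("B", xmin=100),
    }
    scan = ScanState()
    visited = []

    first = chain_getnext(page, scan, root_off=1)   # returns A, saves "go to B"
    visited.append(first.name)

    # Between calls: XID 100 aborts, then XID 101 updates A, giving A -> C.
    # B is assumed to still sit at offset 2 as a dead tuple.
    page[1].xmax = 101
    page[1].next_off = 3
    page[3] = HeapTuple("C", xmin=101)

    while (tup := chain_getnext(page, scan, root_off=1)) is not None:
        visited.append(tup.name)
    return visited
```

In this model run_scenario() walks A and then B and stops there, because
B has no successor; C is on the page and is now the real successor of A,
but the scan never looks at it.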

Now, prior to SSI, I believe this did not matter, because the only
time we traversed the entire HOT chain rather than stopping at the
first visible tuple was when we were using a non-MVCC snapshot.
According to Heikki's submission notes for the patch I was trying to
rebase, the only time that happens is during CLUSTER, at which point
we have an AccessExclusiveLock on the table.  But SSI wants to
traverse the whole HOT chain even when using an MVCC snapshot, so now
we (maybe) have a problem.

I think I have an inkling of how to plug this, but first I have to go
buy groceries.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
