Re: WIP: Detecting SSI conflicts before reporting constraint violations

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Detecting SSI conflicts before reporting constraint violations
Date: 2016-03-11 05:31:37
Message-ID: CAEepm=2kYCegxp9qMR5TM1X3oXHj16iYzLPj_go52R2R07EvnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 11, 2016 at 10:00 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 10 March 2016 at 20:36, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
> wrote:
>>
>> On Fri, Mar 11, 2016 at 8:50 AM, Simon Riggs <simon(at)2ndquadrant(dot)com>
>> wrote:
>> > On 3 February 2016 at 23:12, Thomas Munro
>> > <thomas(dot)munro(at)enterprisedb(dot)com>
>> > wrote:
>> >
>> >>
>> >> It quacks suspiciously like a bug.
>> >
>> >
>> > Agreed
>> >
>> > What's more important is that is very publicly a bug in the eyes of
>> > others
>> > and should be fixed and backpatched soon.
>> >
>> > We have a maintenance release coming in a couple of weeks and I'd like
>> > to
>> > see this in there.
>>
>> As I understand it, the approach I've taken here can't be backpatched
>> because it changes the aminsert_function interface (it needs the
>> current snapshot when inserting), so I was proposing this as an
>> improvement for 9.6. I guess there are other way to get the right
>> snapshot into btinsert (and thence _bt_check_unique), but I didn't
>> think it would be very classy to introduce a 'current snapshot' global
>> variable to smuggle it in.
>
>
> But this is a Serializable transaction, so it only has one snapshot...
>
> This is where adding comments on patch theory would help.

Here's a much simpler version with more comments, and an explanation
below. It handles the same set of isolation test specs.

I dropped the calls to PredicateLockPage and
CheckForSerializableConflictOut, which means I don't even need a
snapshot (and if I ever do, you're right, doh, I should just use
GetTransactionSnapshot()). They were part of my (so far unsuccessful)
attempt to detect a conflict for read-write-unique-4.spec permutation
"r2 w1 w2 c1 c2", where one transaction only writes. That leaves
only a "hypothetical" CheckForSerializableConflictIn check (see
below).

This version seems to handle the obvious overlapping read-then-write
scenarios I've contrived and seen in bug reports. I need to do more
study of the SSI code and theory, and see if there are other
conflicting schedules that are missed but could be detected at this
point. (Possibly including that "r2 w1 w2 c1 c2" case.)

Explanation:

In unpatched master, when _bt_check_unique discovers that a key
already exists in the index, it checks if the heap tuple is live by
calling heap_hot_search, which in turn calls heap_hot_search_buffer,
and then throws away the buffer and returns true or false. If the
result is true, a unique constraint violation is raised.

This patch introduces a drop-in replacement
check_unique_tuple_still_live to call instead of heap_hot_search. The
replacement function also calls heap_hot_search_buffer, but while it
has the buffer it takes the opportunity to do an SSI conflict-in
check. At that point we know that the transaction is already doomed
to ereport, it's just a question of figuring out whether it should
ereport 40001 first.

The new conflict-in check tells the SSI machinery that we are going to
insert at this index page, even though we aren't. It mirrors the one
that happens right before _bt_findinsertloc and _bt_insertonpg, which
would be reached if this were a non-unique index. It gives the SSI
machinery a chance to detect conflicts that would be created by
writing to this page. If it doesn't detect a conflict, behaviour is
unchanged and the usual unique constraint violation will be raised.

I'm not sure what to make of the pre-existing comment about following
HOT-chains and concurrent index builds (which I moved). Does it mean
there is some way that CREATE INDEX CONCURRENTLY could cause us to
consider the wrong tuple and miss an SSI conflict?

(Tangentially, I believe the nearby ON CONFLICT code needs some SSI
checks and I may investigate that some time if someone doesn't beat me
to it, but not as part of this patch.)

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
ssi-read-write-unique-v3.patch application/octet-stream 12.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-03-11 05:31:56 Re: [PROPOSAL] VACUUM Progress Checker.
Previous Message Joshua D. Drake 2016-03-11 04:52:05 Re: Proposal: RETURNING primary_key()