Re: [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation [and 2 more messages]

From: Ian Jackson <ian(dot)jackson(at)eu(dot)citrix(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: <xen-devel(at)lists(dot)xenproject(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation [and 2 more messages]
Date: 2016-12-13 11:30:28
Message-ID: 22607.56276.807567.924144@mariner.uk.xensource.com
Lists: pgsql-hackers

Thanks to everyone for your attention.

Kevin Grittner writes ("Re: [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation"):
> On Mon, Dec 12, 2016 at 8:45 AM, Ian Jackson <ian(dot)jackson(at)eu(dot)citrix(dot)com> wrote:
> > AIUI the documented behaviour is that "every set of successful
> > transactions is serialisable".
>
> Well, in context that is referring to serializable transactions.
> No such guarantee is provided for other isolation levels.

Indeed.

> I didn't [get the same results]. First, I got this when I tried to
> start the concurrent transactions using the example as provided:

I'm sorry, I didn't actually try my paraphrased repro recipe, so it
contained an error:

> test=# SELECT count(*) FROM t WHERE k=1; -- save value

My Perl code said "k=?" and of course ? got (effectively) substituted
with "'1'" rather than "1". I introduced the error when transcribing
the statements from my Perl script to my prose. Sorry about that.
Next time I will test my alleged recipe with psql.

I should also have told you that I was running this on 9.1.

Kevin Grittner writes ("Re: [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation"):
> On Mon, Dec 12, 2016 at 12:32 PM, Kevin Grittner <kgrittn(at)gmail(dot)com> wrote:
> > As you can see, this generated a serialization failure.
>
> That was on 9.6. On earlier versions it does indeed allow the
> transaction on connection 2 to commit, yielding a non-serializable
> result. This makes a pretty strong case for back-patching this
> commit:

Right. That's part of my point. Thanks.

[back to the first message]
> If you have some way to cause a set of concurrent serializable
> transactions to generate results from those transactions which
> commit which is not consistent with some one-at-a-time order of
> execution, I would be very interested in seeing the test case.
> The above, however, is not it.

I am concerned that there are other possible bugs of this form.
In earlier messages on this topic, it has been suggested that the
"impossible" unique constraint violation is only one example of a
possible "leakage".

Earlier you wrote:

  If I recall correctly, the constraints for which there can be
  errors appearing due to concurrent transactions are primary key,
  unique, and foreign key constraints. I don't remember seeing it
  happen, but it would not surprise me if an exclusion constraint can
  also cause an error due to a concurrent transaction's interaction
  with the transaction receiving the error.

Are all of these cases fixed by fcff8a57519847 "Detect SSI conflicts
before reporting constraint violations" ?

I can try playing around with other kinds of constraints, to try to
discover different aspects or versions of this bug, but my knowledge
of the innards of databases is very limited and I may not be
particularly effective. Certainly if I try and fail, I wouldn't have
confidence that no such bug existed.

And,

Thomas Munro writes ("Re: [HACKERS] [OSSTEST PATCH 0/1] PostgreSQL db: Retry on constraint violation"):
> Ian's test case uses an exception handler to convert a difference in
> error code into a difference in committed effect, thereby converting a
> matter of programmer convenience into a bug.

Precisely. I think this is a fully general technique, which means
that any situation where a transaction can "spuriously" fail is a
similar bug. So I think that ISOLATION LEVEL SERIALIZABLE needs to do
what a naive programmer would expect:

All statements in such transactions, even aborted transactions, need
to see results, and have behaviour, which are completely consistent
with some serialisation of all involved transactions. This must apply
up to (but not including) any serialisation failure error.

If that is the behaviour of 9.6 then I would like to submit a
documentation patch which says so. If the patch is to be backported,
then this ought to apply to all (patched) 9.x versions ?

It would be nice if the documentation stated the error codes that
might be generated. AFAICT that's just 40P01 and 40001 ? (I'm not
sure what 40002 is.)
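
To make the question concrete, here is a rough sketch of the retry loop a
client might write if 40001 (serialization_failure) and 40P01
(deadlock_detected) really are the only SQLSTATEs that call for re-running
the transaction. The names DbError, is_retryable and run_with_retry are
hypothetical, invented for this illustration; they are not taken from
osstest or from any database driver. DbError merely stands in for whatever
exception a real driver raises with a SQLSTATE attached.

```python
# SQLSTATEs assumed to mean "re-run the whole transaction".
# 40001 = serialization_failure, 40P01 = deadlock_detected.
RETRYABLE_SQLSTATES = frozenset({"40001", "40P01"})


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a SQLSTATE."""

    def __init__(self, sqlstate, message=""):
        super().__init__(message or sqlstate)
        self.sqlstate = sqlstate


def is_retryable(exc):
    """True if the error is one of the assumed retryable SQLSTATEs."""
    return isinstance(exc, DbError) and exc.sqlstate in RETRYABLE_SQLSTATES


def run_with_retry(txn, max_attempts=5):
    """Call txn() until it succeeds, re-running only on retryable errors.

    txn is assumed to begin, run and commit (or roll back) one complete
    transaction each time it is called.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DbError as exc:
            # Any other error (for example 23505, unique_violation) is
            # passed straight to the caller, as is the final failed attempt.
            if not is_retryable(exc) or attempt == max_attempts:
                raise
```

With a loop like this, a unique violation that is really a disguised
serialisation conflict is never retried, which is why the error code
reported in that situation matters to the application.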

> For the record, read-write-unique-4.spec's permutation r2 w1 w2 c1 c2
> remains an open question for further work.

Is this another possible bug of this form ?

Ian.
