Re: Potential G2-item cycles under serializable isolation

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Kyle Kingsbury <aphyr(at)jepsen(dot)io>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Potential G2-item cycles under serializable isolation
Date: 2020-06-02 16:50:47
Message-ID: CAH2-WzmD__iOD-zBNDU07ZDQkha36cJ5aXdxB8BYW6=bWR2oqg@mail.gmail.com
Lists: pgsql-bugs

On Tue, Jun 2, 2020 at 9:19 AM Kyle Kingsbury <aphyr(at)jepsen(dot)io> wrote:
> OK! So I've designed a variant of this test which doesn't use ON CONFLICT.
> Instead, we do a homebrew sort of upsert: we try to update the row in place by
> primary key; if we see zero records updated, we insert a new row, and if *that*
> fails due to the primary key conflict, we try the update again, under the theory
> that since we now know a copy of the row exists, we should be able to update it.
>
> https://github.com/jepsen-io/jepsen/blob/f47eb25ab32529a7b66f1dfdd3b5ef2fc84ed778/stolon/src/jepsen/stolon/append.clj#L31-L108

Thanks, but I think that this link is wrong, since the code it points
to still uses ON CONFLICT. Correct me if I'm wrong, but I believe that
you intended to link to this:

https://github.com/jepsen-io/jepsen/commit/ac4956871c8227d57d11a665e43c3d68bb7d7ec1#diff-0f5b390b5cdbd8650cf39e3c3f6f365fR31-R65
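
For readers following along, the update/insert/update flow described in
the quoted text can be sketched roughly as below. This is only an
illustration, not the Jepsen test code: it uses Python's sqlite3 module
as a stand-in for PostgreSQL so that it runs on its own, and the table
name (txn), its columns (id, val), and the function name
homebrew_upsert are all hypothetical.

```python
import sqlite3

# Hypothetical schema: CREATE TABLE txn (id INTEGER PRIMARY KEY, val TEXT)
UPDATE_SQL = "UPDATE txn SET val = val || ',' || ? WHERE id = ?"
INSERT_SQL = "INSERT INTO txn (id, val) VALUES (?, ?)"


def homebrew_upsert(conn, key, element):
    """Append `element` to the row for `key`, creating the row if needed.

    Step 1: try to update the row in place by primary key.
    Step 2: if zero rows were updated, try to insert a new row.
    Step 3: if the insert hits a primary key conflict, some other writer
            created the row in the meantime, so retry the update.
    """
    cur = conn.execute(UPDATE_SQL, (element, key))
    if cur.rowcount > 0:
        return "updated"

    try:
        conn.execute(INSERT_SQL, (key, element))
        return "inserted"
    except sqlite3.IntegrityError:
        # Primary key conflict: the row now exists, so update it instead.
        cur = conn.execute(UPDATE_SQL, (element, key))
        return "updated-after-conflict" if cur.rowcount > 0 else "missing"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE txn (id INTEGER PRIMARY KEY, val TEXT)")
    print(homebrew_upsert(conn, 1, "a"))
    print(homebrew_upsert(conn, 1, "b"))
    print(conn.execute("SELECT val FROM txn WHERE id = 1").fetchone()[0])
```

One caveat about the stand-in: in PostgreSQL a failed INSERT aborts the
enclosing transaction, so a real client would presumably need something
like a savepoint around step 2 before retrying the update. The sketch
does not model that, nor does it exercise any concurrency.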

> Unfortunately, I'm still seeing tons of G2-item cycles. Whatever this is, it's
> not related to ON CONFLICT.

Good to have that confirmed. Obviously we'll need to do more analysis
of the exact circumstances of the anomaly. That might take a while.

> I get the sense that the Postgres docs have already diverged from the ANSI SQL
> standard a bit, since SQL 92 only defines three anomalies (P1, P2, P3), and
> Postgres defines a fourth: "serialization anomaly".

> I can see two ways to reconcile this--one being that Postgres chose the anomaly
> interpretation of the SQL spec, and the result is... maybe internally
> inconsistent? Or perhaps one of the operations in this workload actually *is* a
> predicate operation--maybe by dint of relying on a uniqueness constraint?

You might find that "A Critique of ANSI SQL Isolation Levels" provides
useful background information:

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf

One section in particular may be of interest:

"ANSI SQL intended to define REPEATABLE READ isolation to exclude all
anomalies except Phantom. The anomaly definition of Table 1 does not
achieve this goal, but the locking definition of Table 2 does. ANSI’s
choice of the term Repeatable Read is doubly unfortunate: (1)
repeatable reads do not give repeatable results, and (2) the industry
had already used the term to mean exactly that: repeatable reads mean
serializable in several products. We recommend that another term be
found for this."

--
Peter Geoghegan
