Re: Potential G2-item cycles under serializable isolation

From: Kyle Kingsbury <aphyr(at)jepsen(dot)io>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Potential G2-item cycles under serializable isolation
Date: 2020-06-05 13:02:58
Message-ID: e1c7919d-2dc5-4099-e825-9bda876f2e82@jepsen.io
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 6/4/20 8:20 PM, Peter Geoghegan wrote:
> Can you explain the anomaly with reference to the actual SQL queries
> executed in the log? Is the information that I've provided sufficient?

Elle automatically explains these anomalies, so it might be easier to interpret
with the visualization and/or interpretation of the anomaly--you'll find those
in store/latest/elle/. Looking at the data structure representation of the
anomaly, here's what I see in jepsen-failure-2.txt. I'm looking at the first
(and only) anomaly in :workload/:anomalies/:G2-item.

{:cycle
     [{:type :ok,
       :f :txn,
       :value
       [[:r 100 nil] [:append 101 1] [:r 101 [1]] [:r 96 [1 2]]],
       :time 750303448,
       :process 23,
       :index 727}
      {:type :ok,
       :f :txn,
       :value [[:append 100 1] [:r 99 nil] [:r 101 nil] [:append 4 4]],
       :time 751922048,
       :process 5,
       :index 729}
      {:type :ok,
       :f :txn,
       :value
       [[:r 100 nil] [:append 101 1] [:r 101 [1]] [:r 96 [1 2]]],
       :time 750303448,
       :process 23,
       :index 727}],
     :steps
     ({:type :rw,
       :key 100,
       :value :elle.list-append/init,
       :value' 1,
       :a-mop-index 0,
       :b-mop-index 0}
      {:type :rw,
       :key 101,
       :value :elle.list-append/init,
       :value' 1,
       :a-mop-index 2,
       :b-mop-index 1}),
     :type :G2-item}

We have a cycle involving two transactions (:cycle)--the first transaction is
shown at the end as well to make it easier to see that it closes the loop. Let's
call them T1 (:index 727) and T2 (index :729). The first step (:steps) is an
anti-dependency: T1 read key 100 and observed the initial state ([:r 100 nil]),
but T2 appended 1 to 100, so we know T1 < T2. The second step is also an
anti-dependency: T2 read key 101 and observed the initial state ([:r 101 nil]).
Both of these transactions completed successfully (:type :ok), which means we
have two committed transactions, both of which failed to observe the other's
writes. These can't be serializable, because if we choose T1 < T2, T2 fails to
observe a write that T1 performed, and if we choose T2 < T1, T1 fails to observe
a write that T2 performed. You'll probably see something similar to this
explanation in store/latest/elle/G2-item.txt, and a corresponding diagram in
store/latest/elle/G2-item/0.svg.

Elle infers, from the presence of G2-item, that this history cannot satisfy
repeatable read (:not #{:repeatable-read}). By extension, it knows (:also-not)
that it can't be serializable, strict serializable, or strong session
serializable either.

Now, to the SQL. T1 was performed by process 23, and T2 by process 5--you've
helpfully restricted the postgres logs to just those two processes, so you're
one step ahead of me. I searched for $1 = '100' to find the read of key 100 from
process 23:

1591312493.044 171762 0 BEGIN 5ed9806c.29ef2 jepsen process 23 LOG:  execute
<unnamed>: BEGIN
1591312493.044 171762 0 SELECT 5ed9806c.29ef2 jepsen process 23 DETAIL: 
parameters: $1 = '100'
1591312493.044 171762 0 SELECT 5ed9806c.29ef2 jepsen process 23 LOG:  execute
<unnamed>: select (val) from txn2 where sk = $1

So, process 23 begins a transaction and reads key 100 from table txn2, using the
secondary key `sk`. This is new, BTW--earlier versions of the test did every
operation by primary key; I've been making the test harder over time. That read
returns no rows: [:r 100 nil].

Process 23, at least in this log, goes on to set the session to SERIALIZABLE.
This is a bit weird, because that statement should be executed by JDBC when we
started the transaction, right? Makes me wonder if maybe the log lines are out
of order? Now that I look... these are suspiciously alphabetically ordered. If
you sorted the file, that might explain it. :)

1591312493.044 171762 0 SET 5ed9806c.29ef2 jepsen process 23 LOG:  execute
<unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE
1591312493.044 171762 0 SHOW 5ed9806c.29ef2 jepsen process 23 LOG:  execute
<unnamed>: SHOW TRANSACTION ISOLATION LEVEL

Next (in this sorted log, but not in reality), process 23 appends 1 to key 101,
in table txn0:

1591312493.045 171762 0 INSERT 5ed9806c.29ef2 jepsen process 23 DETAIL: 
parameters: $1 = '101', $2 = '101', $3 = '1'
1591312493.045 171762 0 INSERT 5ed9806c.29ef2 jepsen process 23 LOG:  execute
<unnamed>: insert into txn0 (id, sk, val) values ($1, $2, $3)

There should have been an update first, but it got sorted later in this file.
Here's the update--this must have happened first, failed, and caused process 23
to back off to an insert.

1591312493.045 171762 0 SAVEPOINT 5ed9806c.29ef2 jepsen process 23 LOG:  execute
<unnamed>: savepoint upsert
1591312493.045 171762 0 UPDATE 5ed9806c.29ef2 jepsen process 23 DETAIL: 
parameters: $1 = '1', $2 = '101'
1591312493.045 171762 0 UPDATE 5ed9806c.29ef2 jepsen process 23 LOG:  execute
<unnamed>: update txn0 set val = CONCAT(val, ',', $1) where id = $2

Next, process 5 begins its transaction:

1591312493.046 171734 0 BEGIN 5ed9806c.29ed6 jepsen process 5 LOG:  execute
<unnamed>: BEGIN
1591312493.046 171734 0 SET 5ed9806c.29ed6 jepsen process 5 LOG: execute
<unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE
1591312493.046 171734 0 SHOW 5ed9806c.29ed6 jepsen process 5 LOG: execute
<unnamed>: SHOW TRANSACTION ISOLATION LEVEL

... and tries to append 1 to key 100 by primary key `id`:

1591312493.046 171734 0 UPDATE 5ed9806c.29ed6 jepsen process 5 DETAIL: 
parameters: $1 = '1', $2 = '100'
1591312493.046 171734 0 UPDATE 5ed9806c.29ed6 jepsen process 5 LOG:  execute
<unnamed>: update txn2 set val = CONCAT(val, ',', $1) where id = $2

Process 23 goes on to release its upsert savepoint, and perform reads of key 101
and 96, observing [1] and [1 2] respectively, before committing. Again,
out-of-order log due to sorting.

1591312493.046 171762 1197960 COMMIT 5ed9806c.29ef2 jepsen process 23 LOG: 
execute <unnamed>: COMMIT
1591312493.046 171762 1197960 RELEASE 5ed9806c.29ef2 jepsen process 23 LOG: 
execute <unnamed>: release savepoint upsert
1591312493.046 171762 1197960 SELECT 5ed9806c.29ef2 jepsen process 23 DETAIL: 
parameters: $1 = '101'
1591312493.046 171762 1197960 SELECT 5ed9806c.29ef2 jepsen process 23 DETAIL: 
parameters: $1 = '96'
1591312493.046 171762 1197960 SELECT 5ed9806c.29ef2 jepsen process 23 LOG: 
execute <unnamed>: select (val) from txn0 where id = $1
1591312493.046 171762 1197960 SELECT 5ed9806c.29ef2 jepsen process 23 LOG: 
execute <unnamed>: select (val) from txn0 where sk = $1

Process 5's update affects zero rows, so it similarly backs off to an insert
statement:

1591312493.047 171734 0 INSERT 5ed9806c.29ed6 jepsen process 5 DETAIL: 
parameters: $1 = '100', $2 = '100', $3 = '1'
1591312493.047 171734 0 INSERT 5ed9806c.29ed6 jepsen process 5 LOG:  execute
<unnamed>: insert into txn2 (id, sk, val) values ($1, $2, $3)
1591312493.047 171734 0 SAVEPOINT 5ed9806c.29ed6 jepsen process 5 LOG:  execute
<unnamed>: savepoint upsert
1591312493.047 171734 1197963 RELEASE 5ed9806c.29ed6 jepsen process 5 LOG: 
execute <unnamed>: release savepoint upsert

... followed by reads of 99 and 101.

1591312493.047 171734 1197963 SELECT 5ed9806c.29ed6 jepsen process 5 DETAIL: 
parameters: $1 = '101'
1591312493.047 171734 1197963 SELECT 5ed9806c.29ed6 jepsen process 5 DETAIL: 
parameters: $1 = '99'
1591312493.047 171734 1197963 SELECT 5ed9806c.29ed6 jepsen process 5 LOG: 
execute <unnamed>: select (val) from txn0 where sk = $1
1591312493.047 171734 1197963 SELECT 5ed9806c.29ed6 jepsen process 5 LOG: 
execute <unnamed>: select (val) from txn2 where id = $1

We know the read of 101 observed no rows, which makes sense because process 5
(T2)'s transaction began during process 23's transaction, *before* the commit.
What *doesn't* make sense is...

1591312493.048 171734 1197963 COMMIT 5ed9806c.29ed6 jepsen process 5 LOG: 
execute <unnamed>: COMMIT

Process 5 manages to *commit* despite the anti-dependency cycle! That's legal
under SI, but not under serializability.

Phew! I know it's a lot. Does this help?

--Kyle

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2020-06-05 16:45:47 Re: pgbench bug / limitation
Previous Message PG Bug reporting form 2020-06-05 12:47:43 BUG #16482: Input as table type to function is getting failed