Re: SSI patch version 14

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Dan Ports" <drkp(at)csail(dot)mit(dot)edu>,"Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSI patch version 14
Date: 2011-01-27 18:22:45
Message-ID: 4D4163950200002500039EBE@gw.wicourts.gov
Lists: pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> To clarify, this means that it will get some false positives,
> right?

Yes. But the example you're about to get into isn't one of them.

> For instance:
>
> T1:
> get snapshot
>
> T2:
> get snapshot
> insert R1
> commit
>
> T1:
> read R1
> write R2
>
> T3:
> get snapshot
> read R2
>
> T3:
> commit
>
> T1:
> commit -- throws error
>
>
> T1 has a conflict out to T2, and T1 has a conflict in from T3.
> T2 has a conflict in from T1.
> T3 has a conflict out to T1.
>
> T1 is canceled because it has both a conflict in and a conflict
> out. But the results are the same as a serial order of execution:
> T3, T1, T2.
>
> Is there a reason we can't check for a real cycle, which would let
> T1 succeed?

Yes. Because T2 committed before T3 started, it's entirely possible
that there is knowledge outside the database server that the work of
T2 was done and committed before the start of T3. That outside
knowledge pins T2 before T3, and the rw-conflicts already require T3
before T1 and T1 before T2, so the apparent order of execution
becomes T2, T3, T1, T2: a cycle that no serial order can satisfy. So
you can have anomalies. Let me give you a practical example.

Pretend the organization has receipting terminals in public places.
In most financial systems, receipts are assigned to batches of some
type. Let's say a new batch is opened by inserting a row with the new
batch ID, which implicitly closes the old batch. Receipts are always
added with the maximum batch ID, reflecting the open batch.

Your above example could be:

-- setup
test=# create table ctl (batch_id int not null primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"ctl_pkey" for table "ctl"
CREATE TABLE
test=# create table receipt (batch_id int not null, amt
numeric(13,2) not null);
CREATE TABLE
test=# insert into ctl values (1),(2),(3);
INSERT 0 3
test=# insert into receipt values ((select max(batch_id) from ctl),
50),((select max(batch_id) from ctl), 100);
INSERT 0 2

-- receipting workstation
-- T1 starts working on receipt insert transaction
test=# begin transaction isolation level repeatable read;
BEGIN
test=# select 1; -- to grab snapshot, per above
 ?column?
----------
        1
(1 row)

-- accounting workstation
-- T2 closes old receipt batch; opens new
test=# begin transaction isolation level repeatable read;
BEGIN
test=# insert into ctl values (4);
INSERT 0 1
test=# commit;
COMMIT

-- receipting workstation
-- T1 continues work on receipt
test=# select max(batch_id) from ctl;
 max
-----
   3
(1 row)

test=# insert into receipt values (3, 1000);
INSERT 0 1

-- accounting workstation
-- T3 lists receipts from the closed batch
-- (Hey, we inserted a new batch_id and successfully
-- committed, right? The old batch is closed.)
test=# begin transaction isolation level repeatable read;
BEGIN
test=# select * from receipt where batch_id = 3;
 batch_id |  amt
----------+--------
        3 |  50.00
        3 | 100.00
(2 rows)

test=# commit;
COMMIT

-- receipting workstation
-- T1 receipt insert transaction commits
test=# commit;
COMMIT

That session ran at repeatable read, so the anomaly went through: a
receipt was added to batch 3 after accounting had already listed that
batch as closed. Now, with serializable transactions, as you saw in
your example, T1 will instead be rolled back. With a decent software
framework, it will be automatically retried, without any user
interaction. The retry will select a max(batch_id) of 4, and the
insert will succeed and be committed. Accounting's list is accurate.
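
For illustration, the retried T1 might look something like this at
the psql level (assuming the retry runs at the patched serializable
level; the exact statements are up to the application framework):

-- receipting workstation
-- T1 retried after the serialization failure
test=# begin transaction isolation level serializable;
BEGIN
test=# select max(batch_id) from ctl;
 max
-----
   4
(1 row)

test=# insert into receipt values (4, 1000);
INSERT 0 1
test=# commit;
COMMIT

Batch 3 keeps exactly the two receipts accounting listed, and the
1000 receipt lands in the open batch 4.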

-Kevin
