Re: incoherent view of serializable transactions

From: Emmanuel Cecchet <manu(at)frogthinker(dot)org>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: incoherent view of serializable transactions
Date: 2008-12-23 05:42:06
Message-ID: 49507A2E.6010209@frogthinker.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kevin,

If you want to know how to build SERIALIZABLE with a database that
provides SI (Snapshot Isolation), read
http://portal.acm.org/citation.cfm?doid=1376616.137669
Note that in practice, READ COMMITTED is the most largely used isolation
level and its limitations are relatively well understood by the average
programmer that can program his application accordingly. I still don't
get why people would use SERIALIZABLE since there is no efficient
implementation of it.

My 2 cents.
Emmanuel

Kevin Grittner wrote:
> As I've understood limitations of the PostgreSQL implementation of
> SERIALIZABLE transactions, at least the only example given in the
> documentation, revolve around a rather unlikely situation:
>
> Given concurrent transactions T1 and T2 and non-overlapping sets of
> data A and B, T1 reads data including A and uses the data to modify B
> while T2 reads data including B and uses that data to modify A, where
> the modifications performed by either would affect the modifications
> made by the other, if they were visible.
>
> For reasons I'll omit here, that scenario didn't worry me for my
> current uses of PostgreSQL.
>
> I've found another form of deviation from the standard SERIALIZABLE
> behavior, though, which does worry me. Although the above appears to be
> the only situation where the end result after everything commits is
> inconsistent with standard SERIALIZABLE behavior, the PostgreSQL
> implementation allows transactions to view the data in states which
> would never be possible during the application of the transactions in
> series in the order they will appear to have been applied after the
> commit.
>
> Imagine, as an example, a system which involves recording receipts,
> each of which must go into a daily deposit. There is a control table
> with one row containing the current deposit date for receipts.
> Somewhere mid-afternoon that date is updated, all subsequent receipts
> fall into the new day, and a report is run listing the receipts for the
> day and giving the deposit total.
>
> Under a standard-compliant implementation of SERIALIZABLE, this is
> straightforward: a transaction which is inserting a receipt selects the
> deposit date to use in its transaction, and any SELECT of receipts for a
> date prior to the current deposit date will see the accurate, final
> data. Under the PostgreSQL implementation, although data eventually
> gets to a coherent state, there can be a window of time where a SELECT
> can return an incomplete list of receipts for a date which appears to be
> closed, even if all transactions for modifying and viewing data are
> SERIALIZABLE.
>
> -- setup
> create table ctl (k text not null primary key, deposit_date date not
> null);
> insert into ctl values ('receipt', date '2008-12-22');
> create table receipt (receipt_no int not null primary key, deposit_date
> date not null, amount numeric(13,2));
> insert into receipt values (1, (select deposit_date from ctl where k =
> 'receipt'), 1.00);
> insert into receipt values (2, (select deposit_date from ctl where k =
> 'receipt'), 2.00);
>
> -- connection 1
> start transaction isolation level serializable ;
> insert into receipt values (3, (select deposit_date from ctl where k =
> 'receipt'), 4.00);
>
> -- connection 2
> start transaction isolation level serializable ;
> update ctl set deposit_date = date '2008-12-23' where k = 'receipt';
> commit transaction;
> start transaction isolation level serializable ;
> select * from ctl;
> -- (deposit_date shows as 2008-12-23)
> select * from receipt;
> -- (Only receipts 1 and 2 show for 2008-12-22.)
> commit;
>
> -- connection 1
> commit transaction;
>
> -- connection 2
> start transaction isolation level serializable ;
> select * from receipt;
> -- (All receipts for the 2008-12-22 deposit date now show.)
> commit transaction;
>
> At this point, SERIALIZABLE transactions appear to have worked, with
> receipt 3 happening before the update of deposit_date; however, there
> was a window of time when the update to deposit_date was visible and
> receipt 3 was not.
>
> This absolutely can't happen in a standard-compliant implementation.
> At a minimum, this window where visible data lacks coherency should be
> noted in the documentation. I don't know if there's any way to fix
> this without killing performance.
>
> -Kevin
>
>

--
Emmanuel Cecchet
FTO @ Frog Thinker
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu(at)frogthinker(dot)org
Skype: emmanuel_cecchet

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2008-12-23 06:03:08 Re: Visibility map and freezing
Previous Message Tom Lane 2008-12-23 05:33:03 Re: Some semantic details of the window-function spec