Re: Postgresql MVCC, Read Committed Isolation Level and taking "snapshot"

From: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Postgresql MVCC, Read Committed Isolation Level and taking "snapshot"
Date: 2008-05-21 11:48:51
Message-ID: 20080521134851.65c5b114@dawn.webthatworks.it
Lists: pgsql-general

On Tue, 20 May 2008 17:04:25 -0400
Andrew Sullivan <ajs(at)commandprompt(dot)com> wrote:

> On Tue, May 20, 2008 at 08:56:41PM +0200, Ivan Sergio Borgonovo
> wrote:
>
> > I just would like to have a coherent snapshot of some tables.
>
> If you have a multi-statement transaction, then if you are in READ
> COMMITTED you can see changes, and if you are in SERIALIZABLE you
> can't. You can't of course see changes "in the same statement" as
> it were -- that's not meaningful (because either they committed
> before you saw the row or after).

"read committed" is surely not enough and I'd like to avoid the side
effect of serializable that will force me to check if the
serializable transaction succeeded.

My goal is to write something like:
- is the status I see "now" coherent with a set of rules?
  * yes: save a "summary" into another table
  * no: report an error to the client

Now, if later statements see changes committed in the meantime, I
can't evaluate whether the status is coherent, and/or I may save a
summary that is different from the one I checked.
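In SQL the pattern I have in mind would be roughly this (just a
sketch: the join, the "coherency rule" condition and the table
columns are invented for illustration):

begin isolation level serializable;
-- check the rules against one consistent snapshot
-- (s.status <> t.status is a placeholder coherency rule)
select count(*) from source1 s join source2 t on s.id = t.id
  where s.status <> t.status;
-- if the count is 0: insert the summary into the log tables
-- otherwise: report an error to the client and roll back
commit;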

It is not clear to me what this means:

http://www.postgresql.org/docs/8.1/interactive/transaction-iso.html

"Note that only updating transactions may need to be retried;
read-only transactions will never have serialization conflicts."

Then:
"because a serializable transaction cannot modify or lock rows
changed by other transactions after the serializable transaction
began."

Now, I'm not writing to the "source" tables, the ones that have to be
checked for coherency; I'm writing elsewhere.
I'm not concerned if what I save in the "summary" is different from
what I'll find in the DB once the "logging" function has finished. I
just want to save a coherent status.

The logging/summary function will end in something like:
insert into logtable1 select ... from source1 join source2
insert into logtable2 select ... from source3 join source4

The above selects will still have to see the same snapshot as when
the transaction started. Other updates to the source tables may
proceed; I'm not interested in locking them.
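So the whole thing would look more or less like this (a sketch; the
select lists and join conditions are elided as above):

begin isolation level serializable;
-- in serializable mode both selects see the snapshot taken at the
-- first query of the transaction, so they are mutually coherent
insert into logtable1 select ... from source1 join source2 ...;
insert into logtable2 select ... from source3 join source4 ...;
commit;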

To my understanding, if the source and destination tables are
different, and the target rows in the destination will have different
pk/fk, serializable should provide what I need without my having to
worry about
ERROR: could not serialize access due to concurrent update

Since I'm not writing to the source tables, I thought I could try the
"for share" path and avoid "serializable".
In my situation "select for share" shouldn't miss rows if another
transaction updates the source tables.
The selects are going to return the same set of rows... just with
different data inside.
If another transaction commits and the rows have to be re-evaluated,
this should just mean the selects should still take a coherent
snapshot... (?)
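To be concrete, the "for share" variant would look roughly like this
(again a sketch; "cond" and the table name are placeholders):

begin;  -- default read committed
-- concurrent transactions can still read the matching rows, but an
-- UPDATE or DELETE on them blocks until this transaction ends
select * from source1 where cond for share;
-- ... build the summary from the locked rows ...
commit;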

But then... that should mean that the overall function is going to be
re-evaluated.
e.g.

select into _a a from source1 where condition for share;
if (_a is null) then

What's going to happen to this code if another transaction updates
the rows involved?

Will the if condition be re-evaluated?

What is going to happen if I have something like

insert into target select from source for share?

If I have to draw any difference between using "for share" and
serializable in my case, I'd say:
- "for share" inside a transaction may miss "added" rows
- if there is another transaction updating the rows covered by
"for share", the "logging" function will save an AFTER-commit view,
while serializable will save a BEFORE-commit view and, if something
gets in the way, it will abort.
- if I'm writing into a different target with a guarantee of
non-overlapping writes, I shouldn't incur aborted transactions, which
makes the implementation easier and faster, since changed rows won't
force a re-evaluation of the selects.

Explicit locking doesn't seem the way to go since I'm not writing to
the source tables... other transactions shouldn't be blocked from
writing to the source tables, provided I can have a snapshot of the
DB at a certain time.
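i.e. something like (sketch, names elided as before):

begin isolation level serializable;
-- plain selects take no row locks, so writers to the source tables
-- are never blocked; their changes simply stay invisible inside this
-- transaction's snapshot
insert into logtable1 select ... from source1 join source2 ...;
commit;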

How does it sound?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
