Re: A third lock method

From: Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: A third lock method
Date: 2009-12-31 12:20:32
Message-ID: b0f3f5a10912310420h5bf988e1g941fdefa1e8a586a@mail.gmail.com
Lists: pgsql-hackers

2009/12/31 Bruce Momjian <bruce(at)momjian(dot)us>:

> I must be missing something but I thought the only problem with our
> existing snapshot system was that you could see a row updated after your
> snapshot was created, and that the solution to that was to abort the
> transaction that would see the new row.  Can you tell me what I am
> missing?

The problem is rather the opposite. A minimal example of a situation
that the current implementation allows, and which the new proposal
tries to fix, is:

1. The database contains rows X and Y having one column, and having
different values for that column (i.e., X != Y).
2. "Serializable" (in the current PG sense) transactions A and B run
concurrently (i.e., both take their snapshot before the other commits,
so they don't see each other's changes).
3. A executes Y := X: it reads X and updates Y to become the same as X.
4. B executes X := Y: it reads Y and updates X to become the same as Y.

Result: Sequentially executing A and B in either order leads to a
result where X = Y. After steps 1-4 above, however, the values of X
and Y have been swapped (and thus X != Y still holds). Therefore, the
execution was (by definition) not serializable. The reason is that in
any serializable execution, either A would have seen the update
performed by B, or B would have seen the update performed by A. This
problem is called "write skew" in the paper (their example is less
theoretical, but also more complex because of the use of COUNT(..)).
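
To make the example concrete, here is a toy model in Python. It is only
a sketch and not PostgreSQL code: the function names, the dict used as
"database", and the initial values 1 and 2 are all made up for
illustration. It assumes both transactions read from the same initial
snapshot and write to different rows, so there is no write-write
conflict to stop them.

```python
# Toy model of snapshot-based execution; not PostgreSQL code.
# All names and values here are hypothetical, chosen for illustration.

def run_concurrent(db, txns):
    """Every transaction reads from the same initial snapshot."""
    snapshot = dict(db)   # taken before any transaction commits
    result = dict(db)
    for txn in txns:
        # Each transaction computes its writes from the snapshot only,
        # so it never sees the other transaction's update.
        result.update(txn(snapshot))
    return result

def run_serial(db, txns):
    """Each transaction sees the committed writes of its predecessors."""
    state = dict(db)
    for txn in txns:
        state.update(txn(state))
    return state

def txn_a(view):
    # Y := X
    return {"Y": view["X"]}

def txn_b(view):
    # X := Y
    return {"X": view["Y"]}

initial = {"X": 1, "Y": 2}

concurrent = run_concurrent(initial, [txn_a, txn_b])  # {"X": 2, "Y": 1}
serial_ab = run_serial(initial, [txn_a, txn_b])       # {"X": 1, "Y": 1}
serial_ba = run_serial(initial, [txn_b, txn_a])       # {"X": 2, "Y": 2}
```

Both serial orders end with X = Y, while the concurrent run ends with
the values swapped, which matches no serial order.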

So instead of aborting transactions "because otherwise they would see
too many changes", the goal is rather to abort transactions "because
otherwise they wouldn't have seen enough changes".

The SIREAD locks are used to mark "the versions that have been read by
whom" (for all transactions that were concurrent with any of the
active transactions), so that potentially problematic writes that
occur after reads can be detected: "I wrote a new version of something
that was already read by a concurrent transaction, so in any
serialization, I must come after that other transaction". The other
direction ("I read something that has a newer version than what I just
read, so in any serialization, I must come before that other
transaction") can be detected straightforwardly.
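
As a rough sketch of the bookkeeping described above, the Python below
records who read which row and derives "must come before" constraints
when a row is written. Everything in it is hypothetical (the Tracker
class, its methods, the row and transaction names), and it assumes all
tracked transactions are concurrent. The explicit cycle check is a
simplification chosen for illustration; I am not claiming that it is
what the proposal itself does.

```python
from collections import defaultdict

class Tracker:
    """Hypothetical read tracker; not a PostgreSQL data structure.

    Assumes every transaction it sees is concurrent with all the others.
    """

    def __init__(self):
        self.siread = defaultdict(set)  # row -> transactions that read it
        self.before = defaultdict(set)  # txn -> txns that must come after it

    def read(self, txn, row):
        # Mark "this version was read by txn" (the SIREAD-style marker).
        self.siread[row].add(txn)

    def write(self, txn, row):
        # The writer created a new version of something a concurrent
        # transaction already read, so the reader must be serialized first.
        for reader in self.siread[row]:
            if reader != txn:
                self.before[reader].add(txn)

    def has_cycle(self):
        # Depth-first search over the "must come before" constraints.
        visiting, done = set(), set()

        def visit(txn):
            if txn in done:
                return False
            if txn in visiting:
                return True
            visiting.add(txn)
            found = any(visit(nxt) for nxt in self.before.get(txn, ()))
            visiting.discard(txn)
            done.add(txn)
            return found

        return any(visit(txn) for txn in list(self.before))

# The write skew example: A does Y := X, B does X := Y.
skew = Tracker()
skew.read("A", "X")
skew.read("B", "Y")
skew.write("A", "Y")   # B read Y, so B must come before A
skew.write("B", "X")   # A read X, so A must come before B
skew_detected = skew.has_cycle()

# A running alone creates no constraints at all.
alone = Tracker()
alone.read("A", "X")
alone.write("A", "Y")
alone_detected = alone.has_cycle()
```

In the write skew case the two constraints contradict each other (A
before B and B before A), so no serial order exists and one of the two
transactions would have to be aborted.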

Nicolas
