Re: serializable lock consistency

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: serializable lock consistency
Date: 2010-12-20 22:32:46
Message-ID: F6F43EA1-C3A1-410F-8CAA-837BEC767E1A@phlo.org
Lists: pgsql-hackers

On Dec20, 2010, at 18:54 , Robert Haas wrote:
> On Mon, Dec 20, 2010 at 12:49 PM, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>> For me, this is another very good reason to explore this further. Plus, it
>> improves the ratio of grotty-ness vs. number-of-problems-soved ;-)
>
> By all means, look into it further. I fear the boat is filling up
> with water, but if you manage to come up with a workable solution I'll
> be as happy as anyone, promise!

I'll try to create a detailed proposal. To do that, however, I'll require
some guidance on what's acceptable and what's not.

Here's a summary of the preceding discussion:

To deal with aborted transactions correctly, we need to track the last
locker of a particular tuple that actually committed. If we also want
to fix the bug that causes a row lock to be lost upon doing
lock;savepoint;update;restore, that "latest committed locker" will
sometimes need to be a set, since it'll need to store the outer
transaction's xid as well as the latest actually committed locker.

As long as no transaction aborts are involved, the tuple's xmax
contains all the information we need. If a transaction updates,
deletes or locks a row, the previous xmax is overwritten. If that
transaction later aborts, we can no longer tell whether the row had
previously been locked by a committed transaction.
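
To make the information loss concrete, here is a minimal toy model. It is
only a sketch: ToyTuple, lock_or_update and last_committed_locker are
hypothetical names, xids are plain integers, and commit status is a
dictionary rather than anything resembling the real clog lookup.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Toy model only: nothing here mirrors the real heap tuple layout.
COMMITTED, ABORTED = "committed", "aborted"


@dataclass
class ToyTuple:
    xmax: Optional[int] = None  # a single slot, so each writer replaces the last


def lock_or_update(tup: ToyTuple, xid: int) -> None:
    """Lock, update or delete: overwrite xmax and drop the previous value."""
    tup.xmax = xid


def last_committed_locker(tup: ToyTuple, status: Dict[int, str]) -> Optional[int]:
    """Best answer recoverable from xmax alone; None means 'cannot tell'."""
    if tup.xmax is not None and status.get(tup.xmax) == COMMITTED:
        return tup.xmax
    return None


status = {100: COMMITTED, 101: ABORTED}
t = ToyTuple()
lock_or_update(t, 100)  # xid 100 locks the row and later commits
lock_or_update(t, 101)  # xid 101 updates the row and later aborts
result = last_committed_locker(t, status)  # xid 100 is no longer recoverable
```

After the second call the tuple only remembers xid 101, so once 101 aborts
the model has no way to report that xid 100 held a lock.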

And these ideas have come up:

A) Transactions that merely lock a row could put the previous
locker's xid (if >= GlobalXmin) *and* their own xid into a multi-xid,
and store that in xmax. For shared locks, this merely means cleaning
out the existing multi-xid a bit less aggressively. There's
no risk of bloat there, since we only need to keep one committed
xid, not all of them. For exclusive locks, we currently never
create a multi-xid. That'd change: we'd need to create one
if we find a previous locker with an xid >= GlobalXmin. This doesn't
solve the UPDATE and DELETE cases. For SELECT-FOR-SHARE this
is probably the best option, since it comes very close to what
we do currently.

B) A transaction that UPDATEs or DELETEs a tuple could create an
intermediate lock-only tuple which'd contain the necessary
information about previous lock holders. We'd only need to do
that if there actually is one with xid >= GlobalXmin. We could
then choose whether to do the same for SELECT-FOR-UPDATE, or
whether we'd prefer to go with (A).

C) The ctid field is only necessary for updated tuples. We could thus
overlay it with a field which stores the last committed locker after
a DELETE. UPDATEs could be handled either as in (B), or by storing the
information in the ctid-overlay in the *new* tuple. SELECT-FOR-UPDATE
could again either also use the ctid overlay or use (A).

D) We could add a new tuple header field xlatest. To support binary
upgrade, we'd need to be able to read tuples without that field
also. We could then either create a new tuple version upon the
first lock request to such a tuple (which would then include the
new header), or we could simply raise a serialization error if
a serializable transaction tried to update a tuple without the
field whose xmax was aborted and >= GlobalXmin.
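
The decision rules in (A) and in the fallback variant of (D) can be
sketched as two small predicates. This is an illustration under stated
assumptions, not a proposed implementation: both function names and all
parameters are hypothetical, xids are compared as plain integers (ignoring
wraparound), and prev_locker is assumed to be the one committed locker
worth keeping.

```python
from typing import Optional, Set


def multixid_members_for_lock(prev_locker: Optional[int],
                              own_xid: int,
                              global_xmin: int) -> Set[int]:
    """Idea (A): xids the new xmax has to represent after a lock.

    A single member means a plain xid suffices; two members mean a
    multi-xid would have to be created.
    """
    members = {own_xid}
    # Keep the previous locker only while some snapshot might still care.
    if prev_locker is not None and prev_locker >= global_xmin:
        members.add(prev_locker)
    return members


def must_raise_serialization_error(is_serializable: bool,
                                   has_xlatest: bool,
                                   xmax: Optional[int],
                                   xmax_aborted: bool,
                                   global_xmin: int) -> bool:
    """Idea (D), fallback variant: old-format tuple whose lock history is unknown."""
    return bool(
        is_serializable
        and not has_xlatest          # tuple predates the new header field
        and xmax is not None
        and xmax_aborted             # the last writer of xmax aborted
        and xmax >= global_xmin      # and is recent enough to matter
    )
```

For example, multixid_members_for_lock(100, 105, 90) keeps both xids, while
a previous locker of 80 with the same GlobalXmin of 90 is dropped and the
exclusive locker can keep using a plain xid.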

I have the nagging feeling that (D) will meet quite some resistance. (C) wasn't
too well received either, though I wonder if that'd change if the grotty-ness
was hidden behind an API, much like the xvac/cmin/cmax overlay is. (B) seems
like a lot of overhead, but maybe cleaner. More research is needed though to
check how it'd interact with HOT and how to get the locking right. (A) is IMHO
the best solution for SELECT-FOR-SHARE since it's very close to what we do
today.

Any comments? Especially of the "don't you dare" kind?

best regards,
Florian Pflug
