Re: Referential Integrity and SHARE locks

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Csaba Nagy <nagy(at)ecircle-ag(dot)com>, Richard Huxton <dev(at)archonet(dot)com>, postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Referential Integrity and SHARE locks
Date: 2007-02-03 01:37:09
Message-ID: 45C3E745.5000006@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Simon Riggs wrote:
> My earlier thinking was that Oracle appears to be able to avoid locking
> and my thought was that this was simply a rather dodgy interpretation of
> the SQL Standard. Anyway, I'm not happy with simply forgetting the SHARE
> lock; that clearly leads to invalid states in some cases, even if I need
> to have a strong cup of coffee in the morning before I see them.
I think oracle is in a completly different situation here - Oracle imposes
limits on the maximum "size" of a transaction dues to various reasons
I believe - one being the size of the rollback segment. AFAIK, postgres
doesn't impose any such limits (apart from problems with long-running
transactions an vacuums, and possibly if you do "set constraints all deferred")
- which is why row locks have to be stored on-disk in the tuple header,
and not in some shared-memory segment.

Now, _if_ you're already imposing limits on transaction size, than it becomes
quite feasable IMHO to also limit the number of row-locks a transaction can take -
and to just store them in memory. This again makes column-level locking much
easier I'd think.

> Using SELECT ... FOR SHARE in RI checks is better than using FOR UPDATE,
> but its still too heavy a lock for many scenarios.
I think it's not too heavy, but it's actually the wrong kind of lock for ri
checks. Both "SHARE" and "EXCLUSIVE" row locks are, well, _row_ locks - the
lock a specific tuple. This is fine, if you want to guarantee that
a certain tuple stays as it is as long as still need it. But it's not really
what a RI constraints wants to ensure. An RI constraint actually wants to
force a specific _condition_ (namely, the existence of a row with a certain
value in a certain column) to be true, not prevent a specific physical tuple
from being modifier.

Now, a generic mechanism for "condition locking" is probably impossible to
implement with large performance sacrifices - but AFAICS the cases needed for
RI checks are always of the form "Is there a row where (field1, field2, ...) =
(a, b, c, ...)". And - at least for RI-Checks done when updating the _referencing_
table, postgres already forces an index to exist on (field1, field2, ...) I think.

The condition "There is a row where (field1, field2, ...) = (a,b,c,...)" is
the same as saying "There as an index entry for (a,b,c,....) that points to a live row".
_If_ is was possible to somehow take a lock on a index key (not a certain index entry,
but rather _all_ entries with a given key), than that could maybe be used for more
efficient RI locks. I guess this would need some sort of tuple-visibility-in-index entries,
but it seems that there a few people interested in making this happen.

greetings, Florian Pflug

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-02-03 01:56:31 Re: writing new regexp functions
Previous Message Jeremy Drake 2007-02-03 00:59:54 Re: writing new regexp functions