On 16 January 2012 21:30, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Useful, yes. Harder than it looks, probably. I tried to mock up a
> version of this years ago for a project where I needed it, and ran into
> all kinds of race conditions.
Can you remember any details about those race conditions?
> Anyway, if it could be made to work, this is extremely useful for any
> application which needs queueing behavior (with is a large plurality, if
> not a majority, of applications).
Ok, based on this feedback I decided to push further and try
implementating this. See POC/WIP patch attached. It seems to work
for simple examples but I haven't yet tried to break it or see how it
interacts with more complicated queries or high concurrency levels.
It probably contains at least a few rookie mistakes! Any feedback
The approach is described in my original email. Short version:
heap_lock_tuple now takes an enum called wait_policy instead of a
boolean called nowait, with the following values:
LockWaitBlock: wait for lock (like nowait = false before),
LockWaitError: error if not immediately lockable (like nowait = true
LockWaitSkip: give up and return HeapTupleWouldBlock if not
immediately lockable (this is a new policy)
The rest of the patch is about getting the appropriate value down to
that function call, following the example of the existing nowait
support, and skipping rows if you said SKIP LOCKED DATA and you got
Compared to one very popular commercial database's implementation, I
think this is a little bit friendlier for the user who wants to
distribute work. Let's say you want to lock one row without lock
contention, which this patch allows with FETCH FIRST 1 ROW ONLY FOR
UPDATE SKIP LOCKED DATA in an SQL query. In that other system, the
mechanism for limiting the number of rows fetched is done in the WHERE
clause, and therefore the N rows are counted *before* checking if the
lock can be obtained, so users sometimes have to resort to stored
procedures so they can control the FETCH from a cursor imperatively.
In another popular commercial database from Redmond, you can ask for
the top (first) N rows while using the equivalent of SKIP LOCKED DATA
and it has the same effect as this patch as far as I can tell, and
another large blue system is the same.
As discussed in another branch of this thread, you can probably get
the same effect with transactional advisory locks. But I personally
like row skipping better as an explicit feature because:
(1) I think there might be an order-of-evaluation problem with a WHERE
clause containing both lock testing and row filtering expressions (ie
it is undefined right?) which you might need subselects to work around
(ie to be sure to avoid false positive locks, not sure about this).
(2) The advisory technique requires you to introduce an integer
identifier if you didn't already have one and then lock that despite
having already said which rows you want to try to lock in standard row
(3) The advisory technique requires all users of the table to
participate in the advisory lock protocol even if they don't want to
use the option to skip lock.
(4) It complements NOWAIT (which could also have been done with
transactional advisory locks, with a function like try_lock_or_fail).
(5) I like the idea of directly porting applications from other
databases with this feature (that is what led me here).
I can also imagine some other arguments against SKIP LOCKED DATA: "the
type of applications that would use this technique generate too much
dead tuple churn for PostgreSQL anyway" (for example, compared to the
update-tuples-in-place systems like DB2 z/OS edition where SKIP LOCKED
DATA is used to distribute work efficiently), and "databases shouldn't
be used as queues anyway, for that we have $X", and "skipping rows
provides an inconsistent view of the data" (ie a result set that never
Here are some examples of previous requests or discussion of this
Thanks for reading!
In response to
pgsql-hackers by date
|Next:||From: Hitoshi Harada||Date: 2012-02-04 13:01:41|
|Subject: Re: Patch: Allow SQL-language functions to reference
parameters by parameter name|
|Previous:||From: Marti Raudsepp||Date: 2012-02-04 10:40:59|
|Subject: Re: Caching for stable expressions with constant arguments v6|