Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order
Date: 2009-10-27 21:05:45
Message-ID: 6197.1256677545@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Oct 27, 2009 at 1:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Right, the case would be something like

>>> select * from
>>> (select * from foo order by x limit n) ss
>>> for update of ss;

>> That's a pretty odd construction.

> Dunno why you think that. That's exactly what one would write if one
> wanted certain operations to execute in a different order than they're
> defined to execute in within a single query level. We have not
> previously been very clear about the order of operations for FOR UPDATE
> locking relative to other steps, but now we will be.

Actually ... it strikes me that there is another way we could approach
this. Namely, leave the semantics as-is (FOR UPDATE runs last) and
document that you can do

select * from
(select * from foo for update) ss
order by x limit n;

if you need FOR UPDATE to run before sorting. Or perhaps better,
redefine the ordering as ORDER BY then FOR UPDATE then LIMIT. Swapping
FOR UPDATE and LIMIT has no performance cost and eliminates the worse of
the two complaints in the documentation, without breaking any working
queries AFAICS. If you have the case where you want to cope with
concurrent updates to the sort key, then you can write the more
complicated query, and it's gonna cost ya. But that's not a typical
usage, as proven by the fact that it took years to realize there was
a problem there. So we shouldn't optimize for that usage at the expense
of cases where the sort key isn't expected to change.

It could be argued that this approach doesn't satisfy the principle of
least astonishment as well as doing FOR UPDATE first, but on reflection
I'm not sure I buy that. The traditional definition has been that we
only lock the rows that are actually returned, and putting FOR UPDATE
underneath the sort will break that expectation. If it's only underneath
LIMIT we can still meet that expectation.

So I'm liking this more the more I think about it ... and it's also
significantly less work than the other way would be.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-10-27 21:09:21 Re: Extraneous newlines in logfile from vacuumdb
Previous Message Greg Smith 2009-10-27 20:50:26 Re: Parsing config files in a directory