Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order
Date: 2009-10-27 15:22:48
Message-ID: 14029.1256656968@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Could the desired behavior be obtained using a CTE?

>> Nope, we push FOR UPDATE into WITHs too. I don't really see any way to
>> deal with this without some sort of semantic changes.

> ... although on reflection, I'm not sure *why* we push FOR UPDATE into
> WITHs. That seems a bit antithetical to the position we've evolved that
> WITH queries are executed independently of the outer query.

> If we removed that bit of behavior, which hopefully is too new for much
> code to depend on, then the old FOR-UPDATE-last behavior could be
> attained via a WITH. And we'd not have to risk touching the interaction
> between plain subqueries and FOR UPDATE, which is something that seems
> much more likely to break existing apps.

On further investigation, this still seems like a good change to make,
but it doesn't solve the problem at hand. If we make FOR UPDATE not
propagate into WITH then the effect of

with w as (select ... order by x limit n) select * from w for update

is not going to be to lock just the rows pulled from the WITH; it's
going to be to not lock any rows at all. So we're still up against a
performance problem if we make these otherwise-desirable changes in plan
node order.

I realized just now that the backwards-compatibility problem is not
nearly as bad as I thought it was, because a lot of the syntaxes
we might want to change the behavior of will fail outright in 8.4
and before. We had this little bit in preptlist.c:

/*
* Currently the executor only supports FOR UPDATE/SHARE at top level
*/
if (root->query_level > 1)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("SELECT FOR UPDATE/SHARE is not allowed in subqueries")));

and that is the error you will get if you have FOR UPDATE in or applying
to a sub-select that does not get flattened into the calling query.
So in particular, a sub-select using ORDER BY or LIMIT has never been
compatible with FOR UPDATE at all, and that means we can define the
behavior of that combination freely.

What I am thinking we should do is define that FOR UPDATE happens before
ORDER BY or LIMIT normally, but that if the FOR UPDATE is inherited from
an outer query level, it happens after the sub-select's ORDER BY or
LIMIT. The first provision fixes the bugs noted in our documentation,
and the second one allows people to get back the old behavior if they
need it for performance. This also seems reasonably non-astonishing
from a semantic viewpoint.

Actually implementing this will be more than a one-line change, but it
doesn't seem too terribly difficult --- we'll just need to modify the
query representation of FOR UPDATE enough that we can remember which
case we had.

Comments?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2009-10-27 16:13:26 Re: Scaling up deferred unique checks and the after trigger queue
Previous Message Euler Taveira de Oliveira 2009-10-27 15:16:55 Re: per-tablespace random_page_cost/seq_page_cost