Re: Is FOR UPDATE an optimization fence?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is FOR UPDATE an optimization fence?
Date: 2009-10-11 19:17:57
Message-ID: 603c8f070910111217j23932139wa5d121fd535837d8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 11, 2009 at 12:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'm fooling around with pushing FOR UPDATE locking into a new plan node
> type, and I just noticed a behavior that seems a bit bogus.
> Historically we have dealt with FOR UPDATE in sub-selects by flattening
> the sub-select if we could, because the alternative was to fail
> altogether.  For example, consider
>
>        select * from a join (select * from b for update) ss on a.x = ss.y;
>
> The FOR UPDATE effectively got hoisted to the top because that's where
> we could implement it, making this equivalent to
>
>        select * from a join b on a.x = b.y for update of b;
>
> It seems to me, though, that this is changing the semantics.  In the
> latter case it's clear that we should only lock b rows that have a join
> partner in a (which indeed is what happens).  In the former case, what
> I think should be expected to happen is that *all* b rows get locked.
>
> With FOR UPDATE as a plan node, it's possible to fix this by treating
> FOR UPDATE in a sub-select as an optimization fence that prevents
> flattening of the sub-select, much like LIMIT has always done.  The
> FOR UPDATE node will end up at the top of the subplan and it will act
> as the syntax would suggest.
>
> Of course the downside of changing it is that queries that worked fine
> before might work differently (and much slower) now; first because not
> flattening the sub-select might lead to a worse plan, and second because
> locking more rows takes more time.
>
> The alternative would be to let it continue to flatten such sub-selects
> when possible, and to tell anyone who doesn't want that to stick in
> OFFSET 0 as an optimization fence.
>
> It's an entirely trivial code change either way.  I'm inclined to think
> that we should prevent flattening, on the grounds of least astonishment.

It seems like this is somewhat related to the question of embedding an
{INSERT|UPDATE|DELETE}...RETURNING in some arbitrary part of a query
versus only allowing it in a WITH clause. The argument for only
allowing it in a WITH clause is that there is otherwise no guarantee
that it is evaluated in its entirety but just once. ISTM we could
contrariwise give it the handling you're proposing here: allow it
anywhere in the query, but make it act as an optimization fence.

For that reason, I think I'd be inclined to make it act as an
optimization fence if used as a top-level CTE, but otherwise flatten
it, so that the handling is consistent with what we've proposed to do
elsewhere. But I'm not really familiar with how these constructs are
being treated by the executor, so I might be creating a false parallel
here.

The other comment I have is that I *expect* subqueries to be pulled
up. So my own personal POLA would not be violated by locking only the
rows with a join partner; in fact it would be more likely to be
violated by the reverse behavior. I might not be typical, though. My
experience is that not pulling up subqueries tends to have disastrous
effects on performance, so I'm somewhat biased against creating more
situations where that will happen.

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Chernow 2009-10-11 19:32:08 Re: man pages
Previous Message David Fetter 2009-10-11 19:10:38 man pages