Re: UPDATE grabs multiple rows when it seems like it should only grab one

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Kevin Burke <burke(at)shyp(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: UPDATE grabs multiple rows when it seems like it should only grab one
Date: 2016-04-23 15:31:37
Message-ID: 2997.1461425497@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> The understanding of JOIN that I hold is to take two complete relations
> and combine them on some predicate. The from relation here, when complete,
> only has one row and given it is effectively a self-join on the PK the
> result of the join is guaranteed to be a single row. I do not follow how
> the sub-select is allowed to be evaluated multiple times.

You're not perceiving it correctly. In general, a join is defined as
"evaluate the join condition for each row pair in the cross-product
of the two relations, and take all the row pairs for which the join
condition succeeds". The conditions in the query are effectively all
join conditions from a semantic standpoint; the fact that some are
syntactically written as a sub-select doesn't change that. Kevin's query
will work as he expects only if the join condition is stable --- but here
it's volatile in two different ways, one being the underspecified LIMIT,
and the other one being the fact that FOR UPDATE allows the status
condition to see the effects of the outer UPDATE.

What's needed to make this work is to have a hard optimization boundary
between the sub-select and the outer query, so that these conditions don't
act like join conditions. Merely writing sub-select syntax doesn't
provide that (and users wouldn't like it if it did). I think that a CTE
should be a sufficient boundary, although even with a CTE, the FOR UPDATE
is a bit risky: if the outer update could change a row before the CTE's
scan had reached it, you'd still get funny results. Should be OK in this
usage though, because you'd never update a row that the CTE hadn't already
examined.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Noah Misch 2016-04-24 18:15:17 Re: BUG #14081: System LC_COLLATE changed
Previous Message Mathias Kunter 2016-04-23 15:08:16 Re: Re: BUG #14107: Major query planner bug regarding subqueries and indices