Re: WIP fix proposal for bug #6123

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP fix proposal for bug #6123
Date: 2011-08-03 08:50:07
Message-ID: AF63FD94-7275-4216-9A68-016B47F7B914@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Aug3, 2011, at 04:54 , Robert Haas wrote:
> On Tue, Aug 2, 2011 at 2:32 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>>>> Would you feel comfortable with a patch which threw an error on
>>>> the DELETE case, as it does on the UPDATE case?
>>>
>>> Yes, though with a little additional twist. The twist being that
>>> I'd like the error to be thrown earlier, at the time of the
>>> offending UPDATE or DELETE, not later, when the original UPDATE or
>>> DELETE which caused the BEFORE trigger invocation stumbles over
>>> the updated row. It not only seems more logical that way, but it
>>> also makes it possible for the trigger to catch the error, and
>>> react accordingly.
>>
>> I hadn't thought of that. It does seem better in every way except
>> for how much work it takes to do it and how much testing it needs to
>> have confidence in it. Definitely not 9.1 material.
>
> I'm not sure I understand the justification for throwing an error.
> Updating a row twice is not an error under any other circumstances;
> nor is deleting a row you've updated. Why should it be here?

Because updating or deleting a row from within a BEFORE trigger fired
upon updating or deleting that very row causes two intermingled
UPDATE/DELETE executions, not one UPDATE/DELETE followed by another.

Here's a step-by-step description of what happens in the case of two
UPDATES, assuming that we don't ignore any updated and throw no error.

1) UPDATE T SET ... WHERE ID=1
2) Row with ID=1 is found & locked
3) BEFORE UPDATE triggers are fired, with OLD containing the original
row's contents and NEW the values specified in (1)
4) UPDATE T SET ... WHERE ID=1, executed from within one of the triggers
5) BEFORE UPDATE triggers are fired again, with OLD containing the
original row's contents and NEW the value specified in (4).
6) The row is updated with the values specified in (4), possibly changed
by the triggers in (5)
7) The row is updated with the values specified in (2), possibly changed
by the triggers in (3).

Now, in step (7), we're overwriting the UPDATE from steps 4-6 without
even looking at the row that UPDATE produced. If, for example, both UPDATES
do "counter = counter + 1", we'd end up with the counter incremented by
1, *not* by 2, even though there were two distinct UPDATEs. Or even worse,
the inner UPDATE at (4) might have modified the ID. Then, we'll apply the
outer UPDATE in (7), even though the original WHERE condition from (1)
no longer matches the row.

We could attempt to fix that by using the EvalPlanQual machinery to
re-check the WHERE clause and re-compute the new row in (7). However,
EvalPlanQual introduces a lot of conceptional complexity, and can lead
to very surprising results for more complex UPDATES. (There's that whole
self-join problem with EvalPlanQual, for example)

Also, even if we did use EvalPlanQual, we'd still have executed the outer
BEFORE trigger (step 3) with values for OLD and NEW which *don't* match
the row the we actually update in (7). Doing that seems rather hazardous
too - who knows what the BEFORE trigger has used the values for.

The different between Kevin's original patch and my suggestion is, BTW,
that he made step (7) through an error while I suggested the error to
be thrown already at (4).

best regards,
Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2011-08-03 09:25:31 Re: Further news on Clang - spurious warnings
Previous Message Peter Geoghegan 2011-08-03 08:47:07 Re: Further news on Clang - spurious warnings