Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Stanislav Grozev <tacho(at)daemonz(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.
Date: 2015-12-08 14:44:40
Message-ID: 20151208144440.GU4934@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 2015-12-03 17:34:12 -0800, Peter Geoghegan wrote:
> On Thu, Dec 3, 2015 at 1:10 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> > I'll need to think about a fix.
>
> The problem was with the pointer we pass to ExecUpdate().
>
> It's a pointer to the target tuple in shared memory. So the field
> "tuple.t_data->t_ctid" within ExecOnConflictUpdate() starts out
> pointing to an ItemPointerData with the correct ctid (when it
> initially points to the current/target tuple, since as an
> about-to-be-upserted tuple the "t_ctid" field must be a pointer to the
> self-same tuple). Then, it is modified in-place in shared memory by
> heap_update(), within its critical section.

Hm. But why it correct to use t_ctid in the first place? I mean if there
was a previously-failed UPDATE t_ctid will point somewhere meaningless?
Shouldn't we use tuple->t_self, or conflictTid here?

Doing that reveals one change in the regression tests. Namely
-- This shows an attempt to update an invisible row, which should really be
-- reported as a cardinality violation, but it doesn't seem worth fixing:
WITH simpletup AS (
SELECT 2 k, 'Green' v),
upsert_cte AS (
INSERT INTO z VALUES(2, 'Blue') ON CONFLICT (k) DO
UPDATE SET (k, v) = (SELECT k, v FROM simpletup WHERE simpletup.k = z.k)
RETURNING k, v)
INSERT INTO z VALUES(2, 'Red') ON CONFLICT (k) DO
UPDATE SET (k, v) = (SELECT k, v FROM upsert_cte WHERE upsert_cte.k = z.k)
RETURNING k, v;
out of with.sql doesn't fail anymore, and instead returns no rows.

At that point in the regression tests there's a conflicting tuple for
'2'. Rewriting the statement to

WITH simpletup AS (
SELECT 2 k, 'Green' v),
upsert_cte AS (
UPDATE z SET (k, v) = (SELECT k, v FROM simpletup WHERE simpletup.k = z.k)
WHERE k = 2
RETURNING k, v)
UPDATE z SET (k, v) = (SELECT k, v FROM upsert_cte WHERE upsert_cte.k = z.k)
WHERE k = 2
RETURNING k, v;

does *not* error out. That's because it hits the HeapTupleSelfUpdated
block in ExecUpdate(). So, working as designed.

The reason the upsert variant gets an error with master/your patch is
solely that t_ctid already points to the new version of the tuple -
which surely is wrong! t_ctid could point to nearly arbitrary things
here (if the previous target for t_ctid was hot pruned and then replaced
with new contents), unless I miss something.

It sounds to me like the real fix would be to just use
&tuple.t_self. That'll "break" the above regression test. But the reason
for that seems to be entirely independent: Namely that in this case the
tuple is only modified after the heap_lock_tuple(), in the course of the
ExecProject() computing the new tuple version - the SELECT FROM
upsert_cte...

Nasty.

ISTM, that if we really want to protect against UpdatedSelf we need to
to do so after ExecQual(), ExecWCE(), and ExecProject(). Each of those
can trigger such an update.

Am I missing something major here?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kevin Grittner 2015-12-08 16:08:34 Re: BUG #13798: Unexpected multiple exection of user defined function with out parameters
Previous Message Shulgin, Oleksandr 2015-12-08 10:54:50 Re: BUG #13799: Unexpected multiple exection of user defined function with out parameters