Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, Richard Guo <guofenglinux(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Date: 2024-01-13 23:29:15
Message-ID: 1110148.1705188555@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Do you want to take this forward from here?

> Happy to do that, if no one objects to this way forward.

OK, I think I've finally sussed the reason why 86dc90056 created
this problem. Sure, we made a virtual tuple before that (in the
ExecFilterJunk call), but all its by-ref values were referencing the
epqslot_candidate tuple, that is the output of evaluating the EPQ
plan. It was that plan tree's responsibility to produce a valid
tuple, and it did --- there might be pointers to disk buffers in that
virtual tuple, but the EPQ planstate tree would be holding the pins
required to make it legal.

But now, the UPDATE plan tree only delivers changed columns, and the
EPQ tree is the same. So we build a virtual tuple whose not-changed
columns are referencing the "oldslot" on-disk tuple just fetched by
GetTupleForTrigger, and there isn't (or might not be) any pin
protecting those values except oldslot's.

There's one more angle that I'd not understood. We *do* still hold
a pin on the original target tuple --- that's owned by the oldSlot
back in ExecModifyTable, that is resultRelInfo->ri_oldTupleSlot.
That's why there's no problem with the initial newslot, even if
it's virtual. This bug can only manifest when GetTupleForTrigger
locates an updated tuple version that is in a different page.
Then, oldslot's pin is the only one protecting our access to that
tuple.

Also, I realized that my concerns about a vast state space for
execution of the triggers were unfounded. That's because each
time we're about to call a trigger, we'll do

if (!newtuple)
newtuple = ExecFetchSlotHeapTuple(newslot, true, &should_free_new);

which causes newslot to become materialized if it wasn't already.
And of course we've forced oldslot into a materialized state.
So the triggers will never see anything but two materialized slots.

So I'm now convinced that all we need is an ExecMaterializeSlot
call in the EPQ path, much as in Alexander's first patch.

I don't think that the changes in the v4 patch are improvements.
I really do not like

+ newslot = ExecGetUpdateNewTuple(relinfo, epqslot_candidate,
+ oldslot);

and here's why: it is critical that we update the caller's slot,
because that's the only way trigger-driven changes will get back to
ExecModifyTable. This coding obscures what's happening, and would
break badly if ExecGetUpdateNewTuple ever acted differently than it
does now or if a caller passed a slot that wasn't the same slot (which
I'm not convinced never happens; see ExecSimpleRelationUpdate for what
looks like a counterexample). And it's not even saving anything
meaningful. So I want to keep the ExecCopySlot call, although I think
it'd be sensible to do

- if (newslot != epqslot_clean)
+ if (unlikely(newslot != epqslot_clean))

for whatever tiny benefit we can get there.

Also, on looking closer, this change doesn't entitle us to revert
75e03eabe. That's guarding against a dangling-pointer situation
that could be created later, not the one we're fixing here.

So, after a bit more work on the comments, I end with the attached.

BTW, I noticed that ExecUpdatePrologue does

ExecMaterializeSlot(slot);

without any comment as to why it's necessary ... and I rather bet
it isn't. But I'm not going to experiment with changing that in
a bug-fix patch.

regards, tom lane

Attachment Content-Type Size
v5-0001-Fix-potential-use-after-free-for-BEFORE-ROW-UPDATE-t.patch text/x-diff 2.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2024-01-14 00:55:03 Re: BUG #18240: Undefined behaviour in cash_mul_flt8() and friends
Previous Message Thomas Munro 2024-01-13 04:33:50 Re: BUG #18289: postgresql14-devel-14.10-2PGDG.rhel8.x86_64.rpm Contains invalid cLang option in Makefile.global