Re: BUG #16036: Segmentation fault while doing an update

From: Andres Freund <andres(at)anarazel(dot)de>
To: Антон Власов <druidvav(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, PG Bug reporting form <noreply(at)postgresql(dot)org>
Subject: Re: BUG #16036: Segmentation fault while doing an update
Date: 2019-10-04 03:20:20
Message-ID: 20191004032020.se46diwqas43e3tw@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On 2019-10-04 05:11:16 +0300, Антон Власов wrote:
> Reproducible schema:
>
> create table test_table (
> id serial, field int default 0 not null
> );
>
> create or replace function test_table_bu() returns trigger
> language plpgsql
> as
> $$
> begin
> return new;
> end;
> $$;
>
> create trigger test_table_bu
> before update
> on test_table
> for each row
> execute procedure test_table_bu();
>
> insert into test_table (field) values (0);
> test.sh:
>
> psql -d gdeposylka -c "update test_table set field = field + 1 where id = 1;" &
> psql -d gdeposylka -c "update test_table set field = field + 1 where id = 1;" &

Thanks, that's very helpful!

I've started to narrow down the problem:

bool
ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
ResultRelInfo *relinfo,
ItemPointer tupleid,
HeapTuple fdw_trigtuple,
TupleTableSlot *newslot)
{
...

/*
* In READ COMMITTED isolation level it's possible that target tuple
* was changed due to concurrent update. In that case we have a raw
* subplan output tuple in newSlot, and need to run it through the
* junk filter to produce an insertable tuple.
*
* Caution: more than likely, the passed-in slot is the same as the
* junkfilter's output slot, so we are clobbering the original value
* of slottuple by doing the filtering. This is OK since neither we
* nor our caller have any more interest in the prior contents of that
* slot.
*/
if (newSlot != NULL)
{
TupleTableSlot *slot = ExecFilterJunk(relinfo->ri_junkFilter, newSlot);

ExecCopySlot(newslot, slot);
}

the problem is that 'newslot' and 'slot' are the same. As copying first
clears the target slot, that results in tuple slot in a bogus state, with
BufferHeapTupleTableSlot->base.tuple = NULL, but TTS_FLAG_EMPTY not set,
but TTS_FLAG_SHOULDFREE set.

The only reason that doesn't run into the assert in
ExecCopySlotHeapTuple() that the slot bet not be empty, is that we've
already cleared that on the destination slot (which is also the source
slot) at that point...

dstslot->tts_flags &= ~TTS_FLAG_EMPTY;
oldContext = MemoryContextSwitchTo(dstslot->tts_mcxt);
bdstslot->base.tuple = ExecCopySlotHeapTuple(srcslot);
MemoryContextSwitchTo(oldContext);

So I think what we need to do is:
1) only call ExecCopySlot() if slot != newslot - this fixes the crash
2) Add an assert to ExecCopySlot() ensuring source and target slot are
distinct
3) Figure out why our tests didn't catch this, this afaict should be a
tested scenario
4) Fix confusing variable naming around newSlot/newslot in ExecBRUpdateTriggers

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2019-10-04 03:26:01 Re: BUG #16036: Segmentation fault while doing an update
Previous Message Антон Власов 2019-10-04 02:11:16 Re: BUG #16036: Segmentation fault while doing an update