Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Ondřej Jirman <ienieghapheoghaiwida(at)xff(dot)cz>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker
Date: 2019-11-21 22:37:25
Message-ID: 13122.1574375845@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> FWIW my hunch is the bug is somewhere in this chunk of code from
> apply_heap_update:

> oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
> ExecCopySlot(remoteslot, localslot);
> slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed);
> MemoryContextSwitchTo(oldctx);

My first reaction to looking at this was that ExecCopySlot() could
be dropped entirely, because the first thing that slot_modify_cstrings
does is

slot_getallattrs(slot);
ExecClearTuple(slot);

That slot_getallattrs call seems 100% useless as well, because
surely ExecClearTuple is just going to drop all the data.

On closer inspection, though, it looks like the author of this
code is relying on ExecClearTuple to *not* destroy the data,
because the subsequent loop might overwrite only some of the
slot's columns, before it does

ExecStoreVirtualTuple(slot);

which supposes that all the columns are valid.

But of course this is 100% broken, because whatever ExecClearTuple
may have done or not done to the slot's datum/isnull arrays, it
certainly pfree'd the slot's physical tuple. So any pass-by-ref
datums are now dangling pointers.

I imagine the only reason this code has gotten past the valgrind
animals is that we're not testing cases where non-replaced columns
in the subscriber table are of pass-by-ref types.

I draw a direct line between the existence of this bug and the lack
of commentary in slot_modify_cstrings about what it's doing. If the
author had troubled to write a comment explaining these machinations,
he might've noticed the bug, or at least reviewers might have.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2019-11-21 23:34:18 Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker
Previous Message Tomas Vondra 2019-11-21 22:08:54 Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker