Re: WIP: Faster Expression Processing v4

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Faster Expression Processing v4
Date: 2017-03-24 00:36:32
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I found a rather nasty bug :-( ... the comment in EEOP_INNER_VAR_FIRST about

+ /*
+ * Can't assert tts_nvalid, as wholerow var evaluation or such
+ * could have materialized the slot - but the contents are still
+ * valid :/
+ */
+ Assert(op->d.var.attnum >= 0);

is actually averting its eyes from a potentially serious problem. If we
go through ExecEvalWholeRowVar, which calls ExecFetchSlotTupleDatum, and
ExecFetchSlotTuple decides it has to materialize the tuple, then
ExecMaterializeSlot does this:

* Drop the pin on the referenced buffer, if there is one.
if (BufferIsValid(slot->tts_buffer))

slot->tts_buffer = InvalidBuffer;

* Mark extracted state invalid. This is important because the slot is
* not supposed to depend any more on the previous external data; we
* mustn't leave any dangling pass-by-reference datums in tts_values.
* However, we have not actually invalidated any such datums, if there
* happen to be any previously fetched from the slot. (Note in particular
* that we have not pfree'd tts_mintuple, if there is one.)
slot->tts_nvalid = 0;

The problem here is that once we drop the buffer pin, any pointers we may
have into on-disk data are dangling pointers --- we're at risk of some
other backend taking away that shared buffer. (So I'm afraid that the
commentary there is overly optimistic.) So even though those pointers
may still be there beyond tts_nvalid, subsequent references to them are
very dangerous.

I think that it's pretty hard to hit this in practice, maybe impossible,
because the normal case for an "on-disk" tuple is that
TTS_HAS_PHYSICAL_TUPLE is true, so that ExecFetchSlotTuple won't change
the state of the slot. If we have a virtual tuple that has to be
materialized, then by that very token it won't have a buffer pin to drop.
But I find this fragile as heck, and the aforesaid patch comment surely
isn't adequately documenting the safety considerations. Also, if there
ever were a live bug here, reproducing it would be damn hard because of
the low probability that a just-unpinned buffer would get replaced any
time soon. (Hm, I wonder whether the buffer cache needs something
analogous to the syscaches' CLOBBER_CACHE_ALWAYS behavior...)

Besides which, I really really don't like the lack of an "attnum <
tts_nvalid" assertion there; that's just obviously failing to check for
very simple bugs, such as getting the FETCHSOME steps wrong.

So I think that we have got to fix ExecEvalWholeRowVar so that it doesn't
clobber the state of the slot. Right at the moment, the only way to do
that seems to be to do this instead of ExecFetchSlotTupleDatum:

tuple = ExecCopySlotTuple(slot);
dtuple = (HeapTupleHeader)

That's kind of annoying because of the double copying involved, but
tuptoaster.c doesn't expose any functionality for this except
heap_copy_tuple_as_datum(). I figure we can improve it later --- it looks
like we can refactor heap_copy_tuple_as_datum to expose a function that
reads from Datum/isnull arrays, and then call it on the slot's
tts_values/tts_isnull arrays. Seems like it might be a good idea to
think a bit harder about the TupleTableSlot APIs, too, and reduce the
number of cases where execTuples exposes destructive changes to the
state of a slot.

Also, while trying to test the above scenario, I realized that the patch
as submitted was being way too cavalier about where it was applying
CheckVarSlotCompatibility and so on. The ASSIGN_FOO_VAR steps, for
instance, had no protection at all. Think I have that all fixed up

I hope to have a fully reviewed patch to pass back to you tomorrow.
Or Saturday at the latest.

regards, tom lane

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-03-24 00:43:23 Re: WIP: Faster Expression Processing v4
Previous Message Michael Paquier 2017-03-24 00:21:40 Re: Backend crash on non-exclusive backup cancel