Re: TupleTableSlot abstraction

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: TupleTableSlot abstraction
Date: 2018-07-05 10:37:04
Message-ID: CAFjFpRc0yujvShvcU0XiLDE=TQzycVhVidVJPqm_LDnb8dhRNA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 11, 2018 at 6:13 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>
>>
>> The slightly bigger issue is that several parts of the code currently
>> require heaptuples they can scribble upon. E.g. nodeModifyTable.c
>> materializes the tuple - those can be solved by using a local slot to
>> store the tuple, rather than using the incoming one. In nearly all cases
>> we'd still be left with the same number of tuple copy steps, as
>> materializing the tuple with copy into the local storage anyway. A few
>> other cases, e.g. intorel_receive(), just can use ExecCopySlotTuple() or
>> such instead of materialize.
>
> There is a patch changing callers of ExecMaterializeSlot() to use
> ExecGetHeapTupleFromSlot() or ExecSaveSlot(), new functions added by
> me. But I think that needs more work. I don't think the new functions
> are necessary. We could use ExecCopySlotTuple(), ExecFetchSlotTuple()
> in place of ExecGetHeapTupleFromSlot() appropriately. The callers who
> just want the tuple and not necessarily a writable one, can use
> ExecFetchSlotTuple(). Those who want a writable copy can use
> ExecCopySlotTuple(). When the slot is *not* heap or buffer heap tuple
> table slot, we materialize a heap tuple every time we want to get the
> heap tuple out of the slot. If we are doing this multiple times
> without changing the contents of the slot, we will leak memory. I have
> TODOs to check whether that's a problem in practice. May be we should
> use ExecCopySlotTuple() explicitly while fetching tuples from those
> kinds of slot. Same is case while fetching a minimal heap tuple from
> non minimal heap tuple slots.

In the attached set of patches, the ExecMaterializeSlot(),
ExecFetchSlotTuple() and ExecFetchSlotMinimalTuple() have been
changed.

Here's excerpt of the commit messages included in 0006 in the attached
patch-set.
---
Before this work, the contents of the slot could be in virtual form
(in tts_datums and tts_isnull) and/or be stored as a heap tuple and/or
a minimal tuple. When the contents were in heap or minimal tuple
form, those tuples need not be "owned" by the slot. When a tuple is
owned by a slot, it's slot's responsibility to free the memory
consumed by the tuple when the tuple is not needed. That typically
happened when the slot got a new tuple to store in it or when the slot
itself was discarded, mostly along-with the other slots in tuple
table.

Then ExecMaterializeSlot() served two purposes: 1. If the contents of
the slot were not "owned" by the slot, or if there was no heap tuple
in the slot, it created a heap tuple which it "owned" from those
contents. 2. It returned the heap tuple, which the slot "owned". This
was typically used when the caller wanted, from the slot, a heap tuple
which it can upon (esp. system attributes like tableOid) and expected
it to be visible from within the slot.

Since a single slot contained a tuple in various forms, it could "own"
multiple forms at a time and return whichever was requested when
ExecFetchSlotTuple() or ExecFetchSlotMinimalTuple() was called. Those
functions, however, differed from ExecMaterializeSlot() in the sense
that they returned a tuple even when the slot didn't own it.

With this work, every slot contains a tuple in "virtual" form, but
only one of the other two forms. Thus ExecMaterializeSlot() can not
create a heap tuple, and let the slot "own" it, in a slot that can not
contain a heap tuple. Neither ExecFetchSlotTuple() can return a tuple
from a slot which can not contain a heap tuple without consuming
memory (slots with minimal tuple can do some pointer swizzling and
save memory but that lump of memory is not exactly same as a heap
tuple memory lump.). Since such a heap tuple can not be "owned" by the
slot, the memory consumed by the tuple needs to be freed by the
caller. Similarly for ExecFetchSlotMinimalTuple().

Hence, in the new scheme the meaning of ExecMaterializeSlot(),
ExecFetchSlotTuple() and ExecFetchSlotMinimalTuple() needs to change.

ExecMaterializeSlot()
---------------------

ExecMaterializeSlot() should save the contents of the slot into its
native tuple format, whichever it is, so that it "owns" the tuple.

ExecFetchSlotTuple()
--------------------

ExecFetchSlotTuple() should just return the heap tuple when the
underlying slot can "own" a heap tuple and error out when slot can not
contain the requested format e.g. when ExecFetchSlotTuple() is called
on a "virtual" or "minimal" tuple slot. This is in-line with what we
are doing with ExecStore* functions. The earlier callers of
ExecMaterializeSlot(), which expected a heap tuple to be returned,
require to call ExecMaterializeSlot() (no return value) followed by
ExecFetchSlotTuple(). Instead of that ExecFetchSlotTuple() is changed
to accept a second argument "materialize". When it's true, the
returned heap tuple is materialized and "owned" by the slot. When it's
false, the returned heap tuple may not be "owned" by the slot. All the
callers of ExecFetchSlotTuple() can be managed to pass a
HeapTupleTableSlot or a BufferHeapTupleTableSlot. Instead of relying
on the TupleTableSlotType (which may be expanded later for user
defined TupleTableSlots) we rely on the presence get_heap_tuple()
callback in the given slot. That callback is expected to return a heap
tuple as is from the slot.

In a few cases, ExecFetchSlotTuple() is called with a slot which is
filled by tuplestore_gettupleslot(). We use a separate tuple store to
store foreign table tuples for processing AFTER triggers and for
storing tuples returned by a function. They are stored in the tuple
store as minimal tuples. In both the cases, the minimal tuples are
converted to heap tuple (using earlier version of
ExecFetchSlotTuple()) before they are processed further after
obtaining those from a tuple store using tuplestore_gettupleslot().
With the tuple table store abstraction a minimal tuple can be stored
in a MinimalTupleTableSlot. Thus we have two options

1. Pass a MinimalTupleTableSlot to tuplestore_gettupleslot() and get a
minimal tuple in this slot. Convert the minimal tuple to a heap tuple
by allocating memory for the converted tuple. We have to explicitly
release the memory allocated for the heap tuple after it's been
processed. When do we do that exactly in AfterTriggerExecute() is not
clear. ExecFetchSlotTupleDatum() could do that right away.

2. Pass a HeapTupleTableSlot to tuplestore_gettupleslot() and let it
store the heap tuple crafted from minimal tuple, which can be freed
within the same function, if necessary. The heap tuple gets freed when
the slot is used to store the next tuple to be processed.

The second option looks better since it continues to use slot's
mechanism to "own" and free tuples that it "owns". Hence implemented
the same.

ExecFetchSlotMinimalTuple()
---------------------------

Before this work, a TupleTableSlot could "own" a minimal tuple as
well. Thus ExecFetchSlotMinimalTuple() returned a minimal tuple after
constructing and "owning" it if it was not readily available. When the
slot "owned" the minimal tuple, the memory consumed by the tuple was
freed when a new tuple was stored in it or the slot was cleared.

With this work, not all TupleTableSlot types can "own" a minimal
tuple. When fetching a minimal tuple from a slot that can not "own" a
tuple, memory is allocated to construct the minimal tuple, which needs
to be freed explicitly. Hence ExecFetchSlotMinimalTuple() is modified
to flag the caller whether it should free the memory consumed by the
returned minimal tuple.

Right now only a MinimalTupleTableSlot can own a minimal tuple. But we
may change that in future or a user defined TupleTableSlot type (added
through an extension) may be able to "own" a minimal tuple in it.
Hence instead of relying upon TTS_IS_MINIMAL() macro to tell us
whether the TupleTableSlot can "own" a minimal tuple or not, we rely
on the set of callbacks. A TupleTableSlot which can hold a minimal
tuple implements get_minimal_tuple callback. Other TupleTableSlot
types leave the callback NULL.

The minimal tuple returned by this function is usually copied into a
hash table or a file. But, unlike ExecFetchSlotTuple() it's never
written to. Hence the difference between signatures of the two
functions.

---

>
>>
>> I haven't done that, but I think we should split ExecStoreTuple() into
>> multiple versions, that only work on specific types of tuple
>> slots. E.g. seqscan et al would call ExecStoreBufferHeapTuple(), other
>> code dealing with tuples would call ExecStoreHeapTuple(). The relevant
>> callers already need to know what kind of tuple they're dealing with,
>> therefore that's not a huge burden.
>
> I thought so too, but haven't done that change right now. Will work on
> that. That's in my TODO list.

This is still a TODO.

0001-0005 are same as the previous patch set.

>
> 0006 This is improved version of your POC patch. I have changed the
> tuple table slot type in many places which showed up as regression
> failures. Added/modified a bunch of comments, improved error messages,
> corrected code at some places, addressed FIXME's at some places. Also
> added macros to check the TupleTableSlot's type based on
> TupleTableSlotOps set.

The changes to ExecMaterializeSlot(), ExecFetchSlotTuple() and
ExecFetchSlotMinimalTuple() discussed above are included in this
patch.

0007, 0008 are same as previous patch set.

>
> 0009 Replaces ExecMaterializeSlot as described above. I will work on
> this again and examine all the instances of ExecMaterializeSlot as
> described above.

This is merged with 0006.

>
> 0010 Index scan uses reorder queue in some cases. Tuples are stored
> reorder queue as heap tuples without any associated buffer. While
> returning a tuple from reorder queue, IndexNextWithReorder() should
> use TupleTableSlot of type HeapTupleTableSlot. A tuple returned
> directly from an index scan has a buffer associated with it, so should
> use
> TupleTableSlot of type BufferHeapTupleTableSlot. So, use different
> kinds of slots for an index scan.

This is 0009

>
>
> Next steps
> 1. Address TODO in the code. I have listed some of those above.

There are still a handful of TODOs in the patches. I will work on those next.

>
> 2. Right now we are using TupleTableSlotType, an enum, to create slots
> of required type. But extensions which want to add their own slot
> types won't be able to add a type in that enum. So, they will need to
> write their own MakeTupleTableSlot. That function uses the
> TupleTableSlotType to set TupleTableSlotOps and calculate the minimum
> size of slot. We could change the function to accept TupleTableSlotOps
> and the minimum size and it just works for all the extensions. Or it
> could just accept TupleTableSlotOps and there's a callback to
> calculate minimum memory required for the slot (say based on the tuple
> descriptor available).

This is still a TODO.

>
> 3. compile with LLVM and fix any compilation and regression errors.

When I compiled server with just 0003 applied with LLVM, the
compilation went well, but there was a server crash. That patch
changes type of tts_nvalid from int32 to AttrNumber. I tried debugging
the crash with a debug LLVM build, but couldn't complete the work.
Attached patch attrnumber_llvm_type.patch is my incomplete attempt to
fix that crash. I think, we should make it easy to change the data
types of the members in structures shared by JIT and non-JIT code, may
be automatically create both copies of the code somehow. I will get
back to this after addressing other TODOs.

>
> 4. We need to do something with the name ExecStoreVirtualSlot(), which
> is being (and will be) used for all kinds of TupleTableSlot type.
> Right now I don't have any bright ideas.

Still a TODO.

>
> 5. ExecFetch* functions are now one liners, so we could make those
> inline and keep those in header file like, say slot_getattr.

This is still a TODO.

>
> 6. ExecCopySlot can be a thin wrapper if we add a callback copyslot()
> and invoked on the destination slot type.

This is still a TODO.

>
> 7. slot_attisnull() deforms a heap/minimal tuple if that status for
> given attribute is not available tts_isnull. Should we instead add a
> callback attisnull() which can use something like heap_isnull()?

This is still a TODO.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_abstract_tts_patches_v2.tar.zip application/zip 51.8 KB
attrnumber_llvm_type.patch text/x-patch 3.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2018-07-05 10:45:52 Re: Should contrib modules install .h files?
Previous Message Amit Khandekar 2018-07-05 10:07:31 Re: AtEOXact_ApplyLauncher() and subtransactions