Re: TupleTableSlot abstraction

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: TupleTableSlot abstraction
Date: 2018-06-11 12:43:52
Message-ID: CAFjFpReRif2b1FSgvv7=NpxxdLaDHkDet3qrqcXB68EknRjp-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 21, 2018 at 4:13 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:

>
> In my POC I turned TupleTableSlot into basically an abstract base class:

Here's set of patches which move this POC forward towards completion,
not yet complete though.

Here's what has changed.
1. Fixed all the regression failures while running make check from
regress and postgres_fdw.
2. Added/modified a bunch of comments about the new TupleTableSlot
structure. Modified comments in existing functions working with
TupleTableSlots as per the new implementation. Added comments
explaining each callback in TupleTableSlotOps, and their
implementations.
3. Improved error messages in some TupleTableTypeSlot specific implementations.
4. Fixed some coding errors in the POC patch (which didn't show up as
a failure in regression)

>
> You might notice that I've renamed a few fields (tts_values -> values,
> tts_isnull -> nulls, tts_nvalid -> nvalid) that haven't actually changed
> in the above structs / the attached patch. I now think that's probably
> not a good idea, unnecessarily breaking more code than necessary.

Done that. All members of TupleTableSlot have their names starting with tts_

>
> After this change functions like slot_getattr are thin inline wrappers:
> +static inline Datum
> +slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull)
> +{
> + Assert(attnum > 0);
> +
> + if (attnum > slot->nvalid)
> + slot->cb->getsomeattrs(slot, attnum);
> +
> + Assert(attnum <= slot->nvalid);
> +
> + *isnull = slot->nulls[attnum - 1];
> + return slot->values[attnum - 1];
> +}
>
> Note that I've moved system attribute handling out of the slot_getattr
> path - there were a few paths accessing them via slot_getattr, and it's
> a lot of unnecessary branching.

The system attributes are accessed via heap_getsysattr() by passing
that function a tuple from HeapTupleTableSlot or
BufferHeapTupleTableSlot. Those places Assert that the slot being used
is either HeapTupleTableSlot or BufferHeapTupleTableSlot.
slot_getsysattr() which accessed system attributes through a lot is
now removed and replaced with heap_getsysattr as described.

>
> The slightly bigger issue is that several parts of the code currently
> require heaptuples they can scribble upon. E.g. nodeModifyTable.c
> materializes the tuple - those can be solved by using a local slot to
> store the tuple, rather than using the incoming one. In nearly all cases
> we'd still be left with the same number of tuple copy steps, as
> materializing the tuple with copy into the local storage anyway. A few
> other cases, e.g. intorel_receive(), just can use ExecCopySlotTuple() or
> such instead of materialize.

There is a patch changing callers of ExecMaterializeSlot() to use
ExecGetHeapTupleFromSlot() or ExecSaveSlot(), new functions added by
me. But I think that needs more work. I don't think the new functions
are necessary. We could use ExecCopySlotTuple(), ExecFetchSlotTuple()
in place of ExecGetHeapTupleFromSlot() appropriately. The callers who
just want the tuple and not necessarily a writable one, can use
ExecFetchSlotTuple(). Those who want a writable copy can use
ExecCopySlotTuple(). When the slot is *not* heap or buffer heap tuple
table slot, we materialize a heap tuple every time we want to get the
heap tuple out of the slot. If we are doing this multiple times
without changing the contents of the slot, we will leak memory. I have
TODOs to check whether that's a problem in practice. May be we should
use ExecCopySlotTuple() explicitly while fetching tuples from those
kinds of slot. Same is case while fetching a minimal heap tuple from
non minimal heap tuple slots.

>
>
> The biggest issue I don't have a good answer for, and where I don't like
> Haribabu's solution, is how to deal with system columns like oid, ctid,
> tableoid. Haribabu's patch adds storage of those to the slot, but that
> seems like quite the cludge to me. It appears to me that the best thing
> would be to require projections for all accesses to system columns, at
> the original level of access. Then we don't need any sort of special
> codepaths dealing with them. But that's not exactly a trivial change and
> has some added complications too (junkfilter type things).
>

I studied the code extensively to check whether there are any cases
where we fetch system columns using (Var nodes with varattno < 0) from
(headers of ) tuples produced by non-scan nodes (e.g. join,
aggregation). That's not the case. We do have instances of INNER,
OUTER Var with varattno < 0, but those are the cases when trigger
accesses OLD and NEW tuples. I have a patch to add tests for the same.

>
> I haven't done that, but I think we should split ExecStoreTuple() into
> multiple versions, that only work on specific types of tuple
> slots. E.g. seqscan et al would call ExecStoreBufferHeapTuple(), other
> code dealing with tuples would call ExecStoreHeapTuple(). The relevant
> callers already need to know what kind of tuple they're dealing with,
> therefore that's not a huge burden.

I thought so too, but haven't done that change right now. Will work on
that. That's in my TODO list.

Here's description of patches.
0001, 0002 are the same patches as submitted in [1].

0003 changes tts_nvalid from int to AttrNumber. int would take at
least 4 bytes whereas the maximum value that tts_nvalid can take is
restricted by AttrNumber which 2 byte long.

0004 removed TupleDescGetSlot() function, which is obsolete, but
retained for extensions. With TupleTableSlot abstraction, I think the
extensions need to change anyway, so removing that function
altogether.

0005 packs various boolean members from TupleTableSlot structure into
a uint16 tts_flags, with each bit for each one bool member. I think we
can remove SHOULDFREEMIN since SHOULDFREE on MinimalHeapTupleTableSlot
can be used in place of SHOULDFREEMIN. I will do that in the next
version. This reduces the size of TupleTableSlot structure and can be
committed as an independent patch. There's also possibility that only
the BufferHeapTupleTableSlot will have ~SHOULDFREE but all other slots
have SHOULDFREE set, so we can get rid of that flag altogether. But I
need to verify that.

0006 This is improved version of your POC patch. I have changed the
tuple table slot type in many places which showed up as regression
failures. Added/modified a bunch of comments, improved error messages,
corrected code at some places, addressed FIXME's at some places. Also
added macros to check the TupleTableSlot's type based on
TupleTableSlotOps set.

0007 adds tts_type member indicating TupleTableSlot type readily
avaialble for debugging. We may or may not commit this patch.

0008 changes the place where we reset expression context used for
processing RETURNING and ON CONFLICT clause. If a query has both the
clauses, we reset the same expression twice while processing same
input/output tuple. This means that some memory allocated while
processing ON CONFLICT and required by RETURNING will get freed before
the later gets at it. This looks like a bug in the existing code, but
I have not been able to create a reproduction for the same. It showed
up as a regression failure with TupleTableAbstraction changes.

0009 Replaces ExecMaterializeSlot as described above. I will work on
this again and examine all the instances of ExecMaterializeSlot as
described above.

0010 Index scan uses reorder queue in some cases. Tuples are stored
reorder queue as heap tuples without any associated buffer. While
returning a tuple from reorder queue, IndexNextWithReorder() should
use TupleTableSlot of type HeapTupleTableSlot. A tuple returned
directly from an index scan has a buffer associated with it, so should
use
TupleTableSlot of type BufferHeapTupleTableSlot. So, use different
kinds of slots for an index scan.

Next steps
1. Address TODO in the code. I have listed some of those above.

2. Right now we are using TupleTableSlotType, an enum, to create slots
of required type. But extensions which want to add their own slot
types won't be able to add a type in that enum. So, they will need to
write their own MakeTupleTableSlot. That function uses the
TupleTableSlotType to set TupleTableSlotOps and calculate the minimum
size of slot. We could change the function to accept TupleTableSlotOps
and the minimum size and it just works for all the extensions. Or it
could just accept TupleTableSlotOps and there's a callback to
calculate minimum memory required for the slot (say based on the tuple
descriptor available).

3. compile with LLVM and fix any compilation and regression errors.

4. We need to do something with the name ExecStoreVirtualSlot(), which
is being (and will be) used for all kinds of TupleTableSlot type.
Right now I don't have any bright ideas.

5. ExecFetch* functions are now one liners, so we could make those
inline and keep those in header file like, say slot_getattr.

6. ExecCopySlot can be a thin wrapper if we add a callback copyslot()
and invoked on the destination slot type.

7. slot_attisnull() deforms a heap/minimal tuple if that status for
given attribute is not available tts_isnull. Should we instead add a
callback attisnull() which can use something like heap_isnull()?

There are TODOs in the patches for 4, 5, 6, 7.

I will continue to work on these steps.

[1] https://www.postgresql.org/message-id/CAFjFpRerUFX=T0nSnCoroXAJMoo-xah9J+pi7+xDUx86PtQmew@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
pg_abstract_tts_patches.tar.zip application/zip 46.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-06-11 13:20:06 Re: Performance regression with PostgreSQL 11 and partitioning
Previous Message Etsuro Fujita 2018-06-11 11:49:11 Re: CF bug fix items