|From:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>|
|Subject:||"Truncated" tuples for tuple hash tables|
|Views:||Raw Message | Whole Thread | Download mbox|
While looking at the recently-noticed problem that HashAggregate nodes
store more columns of the input than they need to, I couldn't help
noticing how much of the hashtable space goes into HeapTuple header
overhead. A couple months ago we were able to get a useful improvement
in sorting by not storing unnecessary header fields in sort files, and
I'm strongly tempted to do the same in tuple hash tables.
Unlike the case with sort temp files, it's important to be able to
access the stored data without moving/copying it. So, not wishing to
duplicate all the tuple access machinery we have already, I'm
envisioning a compromise design that leaves a couple bytes on the table
but looks enough like a standard tuple to be directly usable.
Specifically, something like this:
typedef struct TruncatedTupleData
uint32 t_len; /* length of tuple */
char pad[...]; /* see below */
int16 t_natts; /* number of attributes */
... the rest matching HeapTupleHeaderData ...
The padding would be chosen such that the offset of t_natts would have
the same value modulo MAXIMUM_ALIGNOF as it has in HeapTupleHeaderData.
This ensures that if a TruncatedTuple is stored starting on a MAXALIGN
boundary, data within it is correctly aligned the same as it would be
in a normal tuple. With the current struct definitions, 2 bytes of
padding would be needed on all supported platforms, and a
TruncatedTuple would be 16 bytes shorter than a regular tuple.
However, because we are also eliminating a HeapTupleData struct, the
total savings in tuple hash tables would be 36 to 40 bytes per tuple.
To make use of a TruncatedTuple, we'd set up a temporary HeapTupleData
struct with its t_data field pointing 16 bytes before the start of the
TruncatedTuple. As long as the code using it never tries to access any
of the missing fields (t_xmin through t_ctid), this would work exactly
like a normal HeapTuple.
Going forward, we'd have to be careful to preserve the existing
field-ordering separation between transaction status fields and data
content fields in tuple headers, but that's just a matter of adding some
comments to htup.h.
It's tempting to think about using this same representation for tuples
stored in memory by tuplesort.c and tuplestore.c. That'd probably
require some changes in their APIs, but I think it's doable.
Comments? Anyone think this is too ugly for words?
regards, tom lane
|Next Message||Hannu Krosing||2006-06-26 14:47:53||Re: vacuum, performance, and MVCC|
|Previous Message||Hannu Krosing||2006-06-26 14:15:54||Re: vacuum, performance, and MVCC|