Re: Variable length varlena headers redux

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Greg Stark" <gsstark(at)mit(dot)edu>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Variable length varlena headers redux
Date: 2007-02-13 17:09:13
Message-ID: 871wku7znq.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> The point I'm trying to get across here is to do things one small step
> at a time; if you insist on a "big bang" patch then it'll never get
> done. You might want to go back and review the CVS history for some
> other big changes like TOAST and the version-1 function-call protocol
> to see our previous uses of this approach.

Believe me I'm not looking for a "big bang" approach. It's just that I've
found problems with any of the incremental approaches. I suppose it's
inevitable that "big bang" approaches always seem like the most perfect since
they automatically involve fixing anything that could be a problem.

You keep suggesting things that I've previously considered and rejected --
perhaps prematurely. Off the top of my head I recall the following four
options from our discussions. It looks like we're circling around option 4.

As long as we're willing to live with the palloc/memcpy overhead at least for
now given that we can reduce it by whittling away at the sites that use the
old macros, that seems like a good compromise and the shortest development
path except perhaps option 1.

And I take it you're not worried about sites that might not detoast a datum or
detoast one in the wrong memory context where previously they were guaranteed
it wouldn't generate a copy? In particular I'm worried about btree code and
plpgsql row/record handling.

I'm not sure what to do about the alignment issue. We could just never align
1-byte headers. That would work just fine as long a the data types that need
alignment don't get ported to the new macros. It seems silly to waste space on
disk in order to save a cpu memcpy that we aren't even going to be saving for
now anyways.

Option 1)

We detect cases where the typmod guarantees either a fixed size or a maximum
size < 256 bytes. In which case instead of copying the typlen from pg_type
into tupledesc and the table's attlen we use the implied attlen or store -3
indicating single byte headers everywhere.

Cons: This implies pallocing and memcpying the datum in heap_deform_tuple.

It doesn't help text or varchar(xxxx) at all even if we mostly store
small data in them.

Pros: it buys us 0-byte headers for things like numeric(1,0) and even char(n)
on single-byte encodings. It buys us 1-byte headers for most numerics
and char(n) or varchar(n).

Option 2)

We have heap_form_tuple/heap_deform_tuple compress and uncompress the headers.
The rule is that headers are always compressed in a tuple and never compressed
when at the end of a datum.

Cons: This implies pallocing and memcpying the datum in heap_deform_tuple

It requires restricting the range of 1-byte headers to 127 or 63 bytes
and always uses 1 byte even for fixed size data. (We could get 0-byte
headers for a small range (ascii characters and numeric integers up to
127) but then 1-byte headers would be down to 31 byte data.)

It implies a different varlena format for on-disk and in-memory

Pros: it works for text/varchar as long as we store small data.

It lets us use network byte order or some other format for the on-disk
headers without changing the macros to access in-memory headers.

Option 3)

We have the toaster (or heap_form_tuple as a shortcut) compress the headers
but delay decompression until pg_detoast_datum. tuples only contain compressed
headers but Datums sometimes point to compressed headers and sometimes
uncompressed headers just as they sometimes point to toasted data and
sometimes detoasted data.

Cons: This still implies pallocing and memcpying the datum at least for now

There could be cases where code expects to deform_tuple and be
guaranteed a non-toasted pointer into the tuple.

requires replacing VARATT_SIZEP with SET_VARLENA_LEN()

Pros: It allows for future macros to examine the compressed datum without
decompressing it.

Option 4)

We have compressed data constructed on the fly everywhere possible.

Cons: requires replacing VARATT_SIZEP and also requires hacking VARDATA to
find the data in the appropriate place. Might need an additional pair of
macros for backwards compatibility in code that really needs to
construct a 4-byte headered varlena.

fragility with risk of VARSIZE / VARDATA being filled in out of order

Requires changing header to be in network byte order all the time.

Pros: one consistent representation for varlena everywhere.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message mark 2007-02-13 17:18:51 Re: Variable length varlena headers redux
Previous Message Bruce Momjian 2007-02-13 17:03:17 Re: TODO item: update source/timezone for 64-bit tz files