Re: Variable length varlena headers redux

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Bruce Momjian" <bruce(at)momjian(dot)us>, "Greg Stark" <gsstark(at)mit(dot)edu>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Variable length varlena headers redux
Date: 2007-02-13 17:02:53
Message-ID: 1470.1171386173@sss.pgh.pa.us
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Bruce Momjian" <bruce(at)momjian(dot)us> writes:
>> Uh, if the macros can read 1 and 4-byte headers, why do we need to
>> allocate memory for them?

> Because the macros can't read 1 and 4-byte headers. If they could we would
> have the problem with VARDATA for code sites that write to the data before
> they write the size.

The way I see this working is that VARDATA keeps its current behavior
and therefore can only be used with datums that are known to be in
4-byte-header form. To avoid breaking code that uses it,
PG_DETOAST_DATUM must always produce a 4-byte-header datum.
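
A minimal sketch of why that forces an allocation. The encoding and all
toy_* names below are assumptions made up for illustration; they are not
PostgreSQL's actual header layout or API. The point is only that a
fixed-offset data accessor cannot read a 1-byte-header datum in place, so
the detoast step has to copy it into 4-byte-header form.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Toy encoding, assumed for illustration only:
 *   long form : 4 header bytes, big-endian total size, top bit clear
 *   short form: 1 header byte, 0x80 | total size (total size <= 127)
 * "Total size" includes the header itself.
 */
#define TOY_HDRSZ 4

static int toy_is_short(const uint8_t *p) { return (p[0] & 0x80) != 0; }

/* Legacy-style accessors: valid ONLY for the long form. */
static uint32_t toy_varsize(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

static uint8_t *toy_vardata(uint8_t *p) { return p + TOY_HDRSZ; }

static void toy_set_varsize(uint8_t *p, uint32_t size)
{
    p[0] = (uint8_t) ((size >> 24) & 0x7F);
    p[1] = (uint8_t) (size >> 16);
    p[2] = (uint8_t) (size >> 8);
    p[3] = (uint8_t) size;
}

/*
 * Hypothetical stand-in for the detoast step: always returns a long-form
 * datum. A short-form input must be copied into new memory because its
 * payload starts at offset 1, not offset TOY_HDRSZ.
 */
static uint8_t *toy_detoast(uint8_t *p)
{
    if (!toy_is_short(p))
        return p;                       /* already long form, no copy */

    uint32_t datalen = (uint32_t) (p[0] & 0x7F) - 1;
    uint8_t *result = malloc(TOY_HDRSZ + datalen);

    if (result == NULL)
        return NULL;
    toy_set_varsize(result, TOY_HDRSZ + datalen);
    memcpy(toy_vardata(result), p + 1, datalen);
    return result;
}
```

Callers that only ever use toy_varsize() and toy_vardata() on the result
of toy_detoast() never see the short form, which is the compatibility
property described above.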

After we have the infrastructure in place, we'd make a pass over
high-traffic functions to replace uses of PG_DETOAST_DATUM with
something that doesn't forcibly expand 1-byte-header datums, and replace
uses of VARDATA on the result with something that handles both header
formats (and would be unsuitable for generating result datums, since
it'd have to assume that the length is already filled in).
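
As a rough illustration of that replacement accessor, again using an
assumed toy encoding and hypothetical TOY_* names rather than anything
from the PostgreSQL tree: a dual-format data macro has to look at the
header byte before it can compute the payload offset, which is exactly
why it cannot be used on a buffer whose length has not been written yet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy encoding, assumed for illustration only:
 *   long form : 4 header bytes, big-endian total size, top bit clear
 *   short form: 1 header byte, 0x80 | total size (total size <= 127)
 */
#define TOY_HDRSZ 4
#define TOY_IS_SHORT(p)  ((((const uint8_t *) (p))[0] & 0x80) != 0)

/* Legacy-style macro: fixed offset, so it is safe before the size is set. */
#define TOY_VARDATA(p)  ((uint8_t *) (p) + TOY_HDRSZ)

/*
 * Hypothetical dual-format macro: read-only, and only meaningful once the
 * header byte holds a valid value.
 */
#define TOY_VARDATA_ANY(p) \
    ((const uint8_t *) (p) + (TOY_IS_SHORT(p) ? 1 : TOY_HDRSZ))

/* Example reader that works on either form without expanding it. */
static uint8_t toy_first_byte(const void *datum)
{
    return *TOY_VARDATA_ANY(datum);
}
```

On a freshly allocated result buffer the header bytes are whatever
happened to be in memory, so TOY_VARDATA_ANY could point at offset 1 or
offset 4 depending on garbage, while TOY_VARDATA always points at offset
4. That is the hazard for code that writes the data before the size.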

I don't see any good reason why datatype-specific functions would ever
need to generate the short-header format directly. The only point where
it's worth trimming the header size is during heap_form_tuple, and we
can do it there at no significant efficiency cost. So uses of VARDATA
in connection with building a new datum need not be touched.
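
A sketch of what trimming at tuple-forming time might look like, under
the same assumed toy encoding. toy_pack() is a hypothetical helper, not
the real heap_form_tuple logic. Since the datum is being copied into the
tuple anyway, choosing the header width at that moment adds only a size
check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy encoding, assumed for illustration only:
 *   long form : 4 header bytes, big-endian total size, top bit clear
 *   short form: 1 header byte, 0x80 | total size (total size <= 127)
 */
#define TOY_HDRSZ     4
#define TOY_SHORT_MAX 127

/* Long-form size accessor (input to packing is always long form). */
static uint32_t toy_varsize(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/*
 * Copy a long-form datum into dst, switching to the short header when the
 * result fits. dst must have room for toy_varsize(src) bytes.
 * Returns the number of bytes written.
 */
static size_t toy_pack(uint8_t *dst, const uint8_t *src)
{
    uint32_t total = toy_varsize(src);
    uint32_t datalen = total - TOY_HDRSZ;

    if (datalen + 1 <= TOY_SHORT_MAX)
    {
        dst[0] = (uint8_t) (0x80 | (datalen + 1));
        memcpy(dst + 1, src + TOY_HDRSZ, datalen);
        return (size_t) datalen + 1;
    }

    memcpy(dst, src, total);    /* too big for a short header: keep as is */
    return total;
}
```

Datatype functions keep building long-form results with the legacy
accessors, and only this one copy site needs to know the short form
exists.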

I'm inclined also to suggest that VARSIZE() need only support 4-byte
format: we could have a second macro that understands both formats and
gets used in the same high-traffic functions in which we are replacing
uses of VARDATA(). There's no benefit in making other sites support
1-byte format for VARSIZE() if they aren't going to support it for
VARDATA().
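
To make the pairing concrete, here is a sketch of a converted
"high-traffic" function under the same assumed toy encoding. The
toy_*_any names are hypothetical. It uses the dual-format size and data
accessors together, so it can compare a 1-byte-header datum with a
4-byte-header one without expanding either.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy encoding, assumed for illustration only:
 *   long form : 4 header bytes, big-endian total size, top bit clear
 *   short form: 1 header byte, 0x80 | total size (total size <= 127)
 */
#define TOY_HDRSZ 4

static int toy_is_short(const uint8_t *p) { return (p[0] & 0x80) != 0; }

/* 4-byte-only size accessor, as used by unconverted call sites. */
static uint32_t toy_varsize(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Hypothetical dual-format accessors, for converted call sites only. */
static uint32_t toy_varsize_any(const uint8_t *p)
{
    return toy_is_short(p) ? (uint32_t) (p[0] & 0x7F) : toy_varsize(p);
}

static uint32_t toy_hdrsz_any(const uint8_t *p)
{
    return toy_is_short(p) ? 1 : TOY_HDRSZ;
}

static const uint8_t *toy_vardata_any(const uint8_t *p)
{
    return p + toy_hdrsz_any(p);
}

/* Byte-wise equality on payloads, regardless of header form. */
static int toy_bytes_equal(const uint8_t *a, const uint8_t *b)
{
    uint32_t alen = toy_varsize_any(a) - toy_hdrsz_any(a);
    uint32_t blen = toy_varsize_any(b) - toy_hdrsz_any(b);

    return alen == blen &&
           memcmp(toy_vardata_any(a), toy_vardata_any(b), alen) == 0;
}
```

A call site that uses only one of the two dual-format accessors gains
nothing, since it would still need a 4-byte-header datum for the other.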

regards, tom lane
