Re: Packed short varlenas, what next?

From: Gregory Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Packed short varlenas, what next?
Date: 2007-02-27 15:16:46
Message-ID: 87zm6zeild.fsf@stark.xeocode.com
Lists: pgsql-hackers
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > As I have mentioned earlier, I'm missing a plan to allow 8-byte varlena
> > sizes.

Hm, change VARHDRSZ to 8 and change all the varlena data types to have an
int64 leading field? I suppose it could be done, and it would give us more
bits to play with in the codespace since then we could limit 4-byte headers to
128M or something. But yes, there are tons of places in the code that
currently do arithmetic on sizes using integers -- and often signed integers
at that.
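
To illustrate the kind of size arithmetic that would need auditing, here is
a minimal sketch. It is not PostgreSQL code, and the helper name
sum_fits_int32 is hypothetical. It adds two sizes in 64 bits and reports
whether the result would still fit the signed 32-bit integers much of the
code uses today:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not PostgreSQL code: add two datum sizes in
 * 64 bits and report whether the total still fits a signed 32-bit int.
 */
static bool
sum_fits_int32(int32_t a, int32_t b, int64_t *total)
{
    assert(a >= 0 && b >= 0);           /* sizes are never negative */
    *total = (int64_t) a + (int64_t) b; /* cannot overflow in 64 bits */
    return *total <= INT32_MAX;
}
```

Any call site where this would return false is a place that silently wraps
if the sum is computed in a plain int.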

But that's a change to what a *detoasted* datum looks like. My patch mainly
changes what a *toasted* datum looks like. (Admittedly after making more data
fall in that category than previously.) The only change to a detoasted datum
is that the size is stored in network byte order.
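
A rough sketch of what storing the size in network byte order means in
practice. The helper names put_len_be and get_len_be are made up for
illustration and are not the macros from the patch. The sketch also assumes
lengths stay below 1GB, so that the top two bits of the first byte in memory
are left free to distinguish header formats:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not from the patch: store and load a 4-byte
 * length word in network (big-endian) byte order, one byte at a time,
 * so the in-memory layout is the same on any host.
 */
static void
put_len_be(unsigned char *hdr, uint32_t len)
{
    assert(len < (1u << 30));   /* assumption: lengths are below 1GB */
    hdr[0] = (unsigned char) (len >> 24);   /* most significant byte first */
    hdr[1] = (unsigned char) (len >> 16);
    hdr[2] = (unsigned char) (len >> 8);
    hdr[3] = (unsigned char) len;
}

static uint32_t
get_len_be(const unsigned char *hdr)
{
    return ((uint32_t) hdr[0] << 24) |
           ((uint32_t) hdr[1] << 16) |
           ((uint32_t) hdr[2] << 8) |
            (uint32_t) hdr[3];
}
```

With this layout the high-order bits of the length always land in the first
byte, whatever the host's endianness, so a reader can look at that one byte
before deciding how to interpret the rest.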

> For the moment I think it should be enough to expect that the patch
> allow for more than one format of TOAST pointer, so that if we ever did
> try to support 8-byte varlenas, there'd be a way to represent them
> on-disk.  Some of the alternatives that we discussed last year used up
> all of the "prefix space" and wouldn't have allowed expansion in this
> particular direction.

Ah yes, I had intended to include the bit-pattern choice in the list as well.

There are two issues there:

1) The lack of 2-byte patterns, which is quite annoying since really *any*
   on-disk datum would fit in a 2-byte header varlena. However, it became
   quite tricky to convert things to 2-byte headers, especially for compressed
   data; it would have made for a much bigger patch to tuptoaster.c and
   pg_lzcompress. I became convinced that it was best to get the most
   important gain first: saving 2 bytes on wider tuples is less important than
   saving 3-6 bytes on narrow tuples.

2) The choice of encoding for toast pointers. Note that currently they don't
   actually save *any* space due to the alignment requirements of the OIDs,
   which seems kind of silly, but I didn't see any reasonable way around that.
   The flip side is that this gives us 24 bits to play with if we want to have
   different types of external pointers or more meta-information about the
   toasted data.

   One of the details here is that I didn't store the compressed bit anywhere
   for external toast pointers. I just made the macro compare the rawsize and
   extsize. If that strikes anyone as evil we could take a byte out of those 3
   padding bytes for flags and store a compressed flag there.
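
On the "3-6 bytes on narrow tuples" figure in issue 1, a small worked sketch
may help. It assumes, and these are assumptions rather than statements about
the patch, that a 4-byte header must start on a 4-byte boundary, that a
1-byte header needs no alignment padding, and that a 1-byte header can
describe up to 126 data bytes. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SHORT_MAX 126    /* assumed payload limit for a 1-byte header */

/* Round an offset up to the next multiple of 4. */
static size_t
align4(size_t off)
{
    return (off + 3) & ~(size_t) 3;
}

/* End offset of a value stored with an aligned 4-byte header. */
static size_t
end_with_4byte_header(size_t off, size_t len)
{
    return align4(off) + 4 + len;
}

/* End offset of the same value stored with an unaligned 1-byte header. */
static size_t
end_with_1byte_header(size_t off, size_t len)
{
    assert(len <= SKETCH_SHORT_MAX);
    return off + 1 + len;
}

/* Bytes saved: 3 from the header itself plus 0-3 bytes of padding. */
static size_t
bytes_saved(size_t off, size_t len)
{
    return end_with_4byte_header(off, len) - end_with_1byte_header(off, len);
}
```

Under those assumptions a 5-byte value starting at offset 0 saves 3 bytes,
and the same value starting at offset 1 saves 6, since the 3 bytes of
padding disappear along with 3 bytes of header.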

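For issue 2, the layout and the compressed test can be sketched roughly as
follows. This is an illustration only: the struct, its field names and the
helpers are hypothetical, not the definitions from the patch, and it assumes
rawsize and extsize are measured the same way so that a plain comparison is
meaningful.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;       /* 4-byte object identifier */

/* Hypothetical external toast pointer; field names are illustrative. */
typedef struct SketchToastPointer
{
    uint8_t header;     /* 1-byte marker meaning "external pointer" */
    uint8_t spare[3];   /* the 24 bits that alignment would otherwise waste */
    int32_t rawsize;    /* size of the original data */
    int32_t extsize;    /* size as stored externally */
    Oid     valueid;    /* assumed: identifies the toasted value */
    Oid     toastrelid; /* assumed: identifies the toast table */
} SketchToastPointer;

/* Number of bits between the header byte and the first aligned field. */
static unsigned
spare_bits(void)
{
    size_t first = offsetof(SketchToastPointer, rawsize);

    assert(first > offsetof(SketchToastPointer, header));
    return (unsigned) ((first - 1) * 8);
}

/* No stored flag: treat the datum as compressed if it shrank on disk. */
static bool
is_compressed(const SketchToastPointer *p)
{
    return p->extsize < p->rawsize;
}
```

If the comparison seems too implicit, one of the spare bytes could carry an
explicit flag instead, at the cost of keeping it in sync with the sizes.
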
-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

