Re: VARIANT / ANYTYPE datatype

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: VARIANT / ANYTYPE datatype
Date: 2011-05-04 23:24:00
Message-ID: 4DC1E010.7090501@dunslane.net
Lists: pgsql-hackers

On 05/04/2011 07:05 PM, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> Excerpts from Tom Lane's message of Wed May 04 14:36:44 -0300 2011:
>>> Just out of curiosity, what actual functionality gain would ensue over
>>> just using text? It seems like doing anything useful with the audit
>>> table contents would still require casting the column to text, or the
>>> moral equivalent of that.
>> Storage efficiency. These people have really huge databases; small
>> changes in how tightly things are packed make a large difference for
>> them. (For example, we developed a type to store SHA-2 digests in a
>> more compact way than bytea mainly for this reason. At one point
>> they also wanted to apply compression to hstore keys and
>> values.)
> Hmm. The prototypical case for this would probably be a 4-byte int,
> which if you add an OID to it so you can resolve the type is going to
> take 8 bytes, plus you are going to need a length word because there is
> really no alternative to the "VARIANT" type being varlena overall, which
> makes it 9 bytes if you're lucky on alignment and up to 16 if you're
> not. That is not shorter than the average length of the text
> representation of an int. The numbers don't seem a lot better for
> 8-byte quantities like int8, float8, or timestamp. It might be
> marginally worthwhile for timestamp, but surely this is a huge amount of
> effort to substitute for thinking of a more compact text representation
> for timestamps.
>
> Pardon me for being unconvinced.
>
>
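
To make the arithmetic in the quoted paragraph concrete, here is a rough
sketch of the byte counts. It is a simplified model, not PostgreSQL's
actual on-disk layout code: the function names are hypothetical, the
1-byte length word assumes the short varlena header, and the padding rule
simply rounds up to an 8-byte boundary as a stand-in for the unlucky
alignment case.

```python
# Simplified size model for a hypothetical VARIANT value versus text.
# All names and the layout itself are assumptions made for illustration.

def variant_size(payload_bytes, header_bytes=1, oid_bytes=4):
    """Length word + type OID + raw payload, with no padding."""
    return header_bytes + oid_bytes + payload_bytes

def padded(size, align=8):
    """Round size up to the next multiple of align (crude alignment model)."""
    return -(-size // align) * align

def text_size(value, header_bytes=1):
    """Length word + decimal text representation of an integer."""
    return header_bytes + len(str(value))

lucky = variant_size(4)                            # 1 + 4 + 4
unlucky = padded(variant_size(4, header_bytes=4))  # 4 + 4 + 4, then padded

print("int4 as VARIANT, lucky alignment:  ", lucky)
print("int4 as VARIANT, unlucky alignment:", unlucky)
for n in (7, 12345, -2147483648):
    print("int4 as text,", n, "->", text_size(n))
```

Under these assumptions, a 4-byte int stored as a VARIANT is smaller than
its text form only when the number has nine or more characters, which
matches the point that it is not shorter than the average text
representation of an int.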

I'm far from convinced that storing deltas per column rather than per
record is a win anyway. I don't have hard numbers to hand, but my vague
recollection is that my tests showed it to be a design that used more space.

cheers

andrew
