Re: Reducing data type space usage

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reducing data type space usage
Date: 2006-09-16 22:58:56
Message-ID: 87wt83wh8v.fsf@enterprisedb.com
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:

> Gregory Stark wrote:
>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>
>> Sure, this helps with CHAR(1) but there were plen
>
> OK.

Oops, sorry, I guess I sent that before I was finished editing it. I'm glad
you could divine what I meant because I'm not entirely sure myself :)

> Well, if you are using TEXT, it is hard to say you are worried about
> storage size. I can't imagine many one-byte values are stored in TEXT.

Sure, what about "Middle name or initial"? Or "Apartment Number"? Or for that
matter "Drive Name" on a Windows box? Just because the user doesn't want to
enforce a limit on the field doesn't mean the data will always be so large.

>> Part of the reason I think the smallfoo data types may be a bright idea in
>> their own right is that the datatypes might be able to do clever things about
>> their internal storage. For instance, smallnumeric could use base 100 where
>> largenumeric uses base 10000.
>
> I hardly think modifying the numeric routines to do two different
> bases is worth it.

It doesn't actually require any modification; the base is already a #define.
It may be worth doing the work to make it a run-time parameter so we don't
need to compile the functions twice.
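
To illustrate what a run-time base would mean, here is a minimal sketch in
Python. It is not PostgreSQL's numeric code: the function names are made up
for this example, it handles non-negative integers only, and it counts just
the digit array, ignoring the header fields. The assumption is that a base
100 digit fits in one byte and a base 10000 digit needs two.

```python
# Illustrative sketch only; not taken from PostgreSQL's numeric.c.
# All function names here are hypothetical.

def to_digits(value, base):
    """Split a non-negative integer into digits of `base`, most significant first."""
    if value < 0:
        raise ValueError("sketch handles non-negative integers only")
    if value == 0:
        return [0]
    digits = []
    while value > 0:
        value, digit = divmod(value, base)
        digits.append(digit)
    digits.reverse()
    return digits


def bytes_per_digit(base):
    """Smallest whole number of bytes that can hold one digit in [0, base)."""
    return max(1, ((base - 1).bit_length() + 7) // 8)


def digit_storage(value, base):
    """Bytes used by the digit array alone, ignoring any header fields."""
    return len(to_digits(value, base)) * bytes_per_digit(base)


if __name__ == "__main__":
    for value in (10, 12345):
        for base in (100, 10000):
            print(value, base, to_digits(value, base), digit_storage(value, base))
```

Because the base is an ordinary argument here, the same routine serves both
cases, which is the point of making it a run-time parameter rather than a
compile-time constant.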

I'm pretty sure it's worthwhile as far as space conservation goes. A datum
holding a value like "10" currently takes 10 bytes including the length
header:

postgres=# select sizeof('10'::numeric);
 sizeof
--------
     10
(1 row)

That would go down to 7 bytes with a 1-byte length header, and down to 4 bytes
with base 100. I.e., reading a table full of small numeric values would be 75%
faster.
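
The arithmetic can be checked with a few lines of Python. This is only a
sketch: the three sizes are the ones quoted above, the helper names are
invented, and it assumes the 75% figure compares the 7-byte and 4-byte
layouts on a scan whose cost is proportional to the bytes read.

```python
# Sizes quoted in the message for the value 10, in bytes.
CURRENT_SIZE = 10       # today, including the length header
SHORT_HEADER_SIZE = 7   # with a 1-byte length header
BASE_100_SIZE = 4       # 1-byte length header plus base 100


def relative_speedup(old_bytes, new_bytes):
    """Extra values read per byte of I/O, as a fraction (0.75 means 75% faster).

    Assumes read time is proportional to the number of bytes scanned.
    """
    return old_bytes / new_bytes - 1


def space_saved(old_bytes, new_bytes):
    """Fraction of the original size that is no longer stored."""
    return 1 - new_bytes / old_bytes


if __name__ == "__main__":
    print(relative_speedup(SHORT_HEADER_SIZE, BASE_100_SIZE))  # 7 bytes vs 4
    print(relative_speedup(CURRENT_SIZE, BASE_100_SIZE))       # 10 bytes vs 4
    print(space_saved(CURRENT_SIZE, BASE_100_SIZE))
```

Measured against the current 10-byte datum instead, the same assumption
gives a larger gain, since 10 bytes shrink to 4.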

With some clever hacking I think we could get integers under 128 down to a
single byte with no length header, just like ASCII characters. But that's a
separate little side project.
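
One way such an encoding could look, purely as a hypothetical sketch in
Python: the high bit of the first byte says whether the byte is the value
itself or a length header. Nothing here reflects an actual or proposed
PostgreSQL on-disk format; the function names and the byte layout are
assumptions made for illustration, and only non-negative integers are
handled.

```python
# Hypothetical encoding, invented for illustration:
#   0xxxxxxx                      -> the byte itself is the value (0..127)
#   1nnnnnnn followed by n bytes  -> big-endian value in the next n bytes

def encode_small_int(value):
    """Encode a non-negative integer; values under 128 take exactly one byte."""
    if value < 0:
        raise ValueError("sketch handles non-negative integers only")
    if value < 128:
        return bytes([value])
    payload = value.to_bytes((value.bit_length() + 7) // 8, "big")
    if len(payload) > 127:
        raise ValueError("value too large for a 7-bit length field")
    return bytes([0x80 | len(payload)]) + payload


def decode_small_int(data):
    """Inverse of encode_small_int."""
    first = data[0]
    if first < 0x80:
        return first
    length = first & 0x7F
    return int.from_bytes(data[1:1 + length], "big")


if __name__ == "__main__":
    for value in (10, 127, 128, 300):
        encoded = encode_small_int(value)
        print(value, encoded.hex(), decode_small_int(encoded))
```

The trade-off in this layout is that larger values lose one bit of the
header byte to the flag, so the length field can describe at most 127
payload bytes.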

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
