Re: Reducing data type space usage

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reducing data type space usage
Date: 2006-09-15 23:07:23
Message-ID: 200609152307.k8FN7Ns28260@momjian.us
Lists: pgsql-hackers

Gregory Stark wrote:
> Case 2) Data types that are different sizes depending on the typmod but are always
> the same size that can be determined statically for a given typmod. In the
> case of an ASCII-encoded database CHAR(n) fits this category, and in any case
> we'll eventually have per-column encoding. NUMERIC(a,b) could also be made
> to fit this.
>
> In cases like these we don't need *any* varlena header, if we could arrange
> for the functions to have enough information to know how large the data
> must be.

I thought about the CHAR(1) case some more. Rather than restrict
single-byte storage to ASCII-encoded databases, I think there is a more
general solution.

First, I don't think any solution that assumes typmod will be around to
help determine the meaning of the column is going to work.

I think what will work is to store a 1-character, 7-bit ASCII value in
one byte, by setting the high bit. This will work for any database
encoding. This is the zero-length header case.

If the single character has its high bit set, it will require a one-byte
length header followed by the high-bit byte, and if it is multi-byte,
perhaps more bytes.

The zero-length header will even work for a VARCHAR(8) field that stores
one 7-bit ASCII character, because it doesn't rely on the typmod.

FYI, we also need to figure out how to store a zero-length string. That
will probably be the high bit set, followed by all zero bits. We never
store a zero byte in strings, so that value should be unique for "".
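
To make the byte layout concrete, here is a rough sketch of the scheme in
Python. It is only an illustration, not backend code: the names encode(),
decode() and HIGH_BIT are made up for this example, and it assumes the
one-byte length header always has its high bit clear (so it covers at most
127 payload bytes), which is one possible way to tell the two cases apart.

```python
HIGH_BIT = 0x80  # marks the "zero-length header" form (assumed layout)


def encode(text: str, encoding: str = "utf-8") -> bytes:
    """Encode a string using the sketched header scheme (hypothetical helper)."""
    raw = text.encode(encoding)
    if b"\x00" in raw:
        # The scheme relies on strings never containing a zero byte.
        raise ValueError("zero bytes are not stored in strings")
    if len(raw) == 0:
        # Empty string: high bit set, all other bits zero.
        return bytes([HIGH_BIT])
    if len(raw) == 1 and raw[0] < HIGH_BIT:
        # Single 7-bit ASCII character: one byte, high bit set, no header.
        return bytes([HIGH_BIT | raw[0]])
    if len(raw) >= HIGH_BIT:
        # Assumption of this sketch: longer values need some other header.
        raise ValueError("sketch only handles payloads up to 127 bytes")
    # Everything else: one-byte length header (high bit clear), then the data.
    return bytes([len(raw)]) + raw


def decode(stored: bytes, encoding: str = "utf-8") -> str:
    """Reverse encode() without consulting any typmod (hypothetical helper)."""
    first = stored[0]
    if first & HIGH_BIT:
        code = first & 0x7F
        return "" if code == 0 else chr(code)
    return stored[1:1 + first].decode(encoding)
```

With this layout "a" takes one byte and "" takes one byte, while a single
high-bit character such as a LATIN1 e-acute takes two (header plus the
byte), and a multi-byte character takes more. Nothing in decode() depends
on the column's typmod.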

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
