Re: Reducing the overhead of NUMERIC data

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Marcus Engene <mengpg(at)engene(dot)se>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reducing the overhead of NUMERIC data
Date: 2005-11-04 21:14:38
Message-ID: 20051104211438.GY9989@pervasive.com
Lists: pgsql-hackers pgsql-patches

On Thu, Nov 03, 2005 at 04:07:41PM +0100, Marcus Engene wrote:
> Simon Riggs wrote:
> >On Thu, 2005-11-03 at 11:13 -0300, Alvaro Herrera wrote:
> >
> >>Simon Riggs wrote:
> >>
> >>>On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of
> >>>that datatype having a 4 byte varlena header. In this example, all of
> >>>those instantiations have the varlena header set to 12, so essentially
> >>>wasting the 4 byte header.
> >>
> >>We need the length word because the actual size in bytes is variable,
> >>due to multibyte encoding considerations.
> >
> >
> >Succinctly put, thanks.
> >
> >Incidentally, you remind me that other databases do *not* vary the
> >character length, even if they do have varying length UTF-8 within them.
> >So if you define CHAR(255) then it could blow up at a random length if
> >you store UTF-8 within it.
> >
> >That's behaviour that I could never sanction, so I'll leave this now.
> >
> >Best Regards, Simon Riggs
> >
>
> Just as a side note, in Oracle you can use the syntax (e.g. on a db
> with utf-8 charset):
>
> column VARCHAR2(10 CHAR)
>
> ...to indicate that Oracle should fit 10 characters there. It might use
> up to 40 bytes in the db, but that's up to Oracle. If I s/10 CHAR/10/, at
> most 10 bytes will fit.
>
> This works very well. The only catch is that it's not good to use more
> than 1000 chars since Oracle's varchars don't want to go past 4000 bytes.

Likewise, other databases use separate character types such as NCHAR
(national char), which is the 16-bit variant.

I think it's perfectly acceptable to have a char type that is
fixed-width in terms of number of bytes, so long as we provide an
alternative. Heck, in my experience char is only used to store things
like hashes that are in ASCII anyway.
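
To make the bytes-versus-characters point concrete, here is a minimal
sketch in Python (chosen only for illustration; nothing in this thread
uses it, and the helper name utf8_sizes is made up). It assumes UTF-8
and shows that a 12-character value has no single byte size unless the
data is plain ASCII, and where a figure like "10 characters, up to 40
bytes" comes from.

```python
# Illustration only: character count vs. UTF-8 byte count.
# utf8_sizes is a hypothetical helper, not part of any database API.

def utf8_sizes(text):
    """Return (number of characters, number of bytes when encoded as UTF-8)."""
    return len(text), len(text.encode("utf-8"))

# Twelve characters each, built from escapes so the code points are unambiguous.
samples = {
    "ascii (e.g. a hex hash)": "a" * 12,       # 1 byte per character
    "latin letter U+00E5": "\u00e5" * 12,      # 2 bytes per character
    "cjk ideograph U+65E5": "\u65e5" * 12,     # 3 bytes per character
}

for label, value in samples.items():
    chars, nbytes = utf8_sizes(value)
    print(f"{label}: {chars} chars, {nbytes} bytes")

# Worst case for a 10-character column: 4 bytes per character in UTF-8.
chars, nbytes = utf8_sizes("\U0001F600" * 10)
print(f"4-byte code points: {chars} chars, {nbytes} bytes")
```

Only the all-ASCII case has a byte length equal to its character count,
which is the case where a byte-counted fixed-width char would cost
nothing.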
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
