Re: Converting MySQL tinyint to PostgreSQL

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Joe <svn(at)freedomcircle(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dawid Kuroczko <qnex42(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Converting MySQL tinyint to PostgreSQL
Date: 2005-07-12 22:10:48
Message-ID: 20050712221048.GA15464@alvh.no-ip.org
Lists: pgsql-general

On Tue, Jul 12, 2005 at 05:37:32PM -0400, Joe wrote:
> Tom Lane wrote:
> >Because the length specification is in *characters*, which is not by any
> >means the same as *bytes*.
> >
> >We could possibly put enough intelligence into the low-level tuple
> >manipulation routines to count characters in whatever encoding we happen
> >to be using, but it's a lot faster and more robust to insist on a count
> >word for every variable-width field.
>
> I guess what you're saying is that PostgreSQL stores characters in
> varying-length encodings.

It _may_ store characters in variable-length encodings. It can use
fixed-length encodings too, such as latin1 or plain ASCII (actually,
unchecked 8 bits, which means just about anything) -- you define that at
initdb time or database creation time, I forget. Getting rid of the
otherwise required length word would be an optimization, but it would
mean the code has to distinguish fixed-length from variable-length
encodings at runtime, which would be painful. So far, nobody has cared
enough about it to do the job.
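
To illustrate the characters-versus-bytes point, here is a minimal
sketch in Python (not PostgreSQL code; the helper name byte_len is made
up for this example). It compares the character count of a string with
its byte count under a fixed-width and a variable-width encoding:

```python
# Illustration only: character count vs. byte count per encoding.
# byte_len is a hypothetical helper, not part of PostgreSQL.

def byte_len(text: str, encoding: str) -> int:
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))

word = "ni\u00f1o"  # 4 characters, one of them non-ASCII (n-tilde)

chars = len(word)                      # counted in characters
latin1_bytes = byte_len(word, "latin-1")  # 1 byte per character
utf8_bytes = byte_len(word, "utf-8")      # n-tilde takes 2 bytes

print(chars, latin1_bytes, utf8_bytes)
```

With latin1 the byte count equals the character count, so a declared
length in characters would also be a length in bytes. With UTF-8 it is
not, which is why a per-value length word is the simple general answer.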

> If it stored character data in Unicode (UCS-16) it would always take
> up two-bytes per character.

Really? We don't support UCS-16, for good reasons (we'd have to rewrite
several parts of the code in order to support zero ('\0') bytes embedded
in strings ... we use regular C strings extensively).
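
A small sketch of the embedded-zero-byte problem, again in Python and
only as an illustration. The function c_string_view is a hypothetical
stand-in for any code that treats a buffer as a NUL-terminated C string;
it is not how the server is written:

```python
# Illustration only: why a 16-bit encoding clashes with C strings.

def c_string_view(raw: bytes) -> bytes:
    """Return what a NUL-terminated string reader would see:
    everything up to (not including) the first zero byte."""
    return raw.split(b"\x00", 1)[0]

text = "AB"
utf16 = text.encode("utf-16-le")  # each ASCII char gets a zero high byte
utf8 = text.encode("utf-8")       # no zero bytes for non-NUL characters

seen_utf16 = c_string_view(utf16)  # truncated after the first byte
seen_utf8 = c_string_view(utf8)    # intact

print(utf16, seen_utf16, seen_utf8)
```

Any routine built on strlen()-style scanning would stop at the first
zero byte of the 16-bit form, while the UTF-8 form passes through
unchanged.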

However, we do support Unicode as UTF-8, and it's been said a couple of
times that characters can be wider than 2 or 3 bytes in some cases. So
I don't see how UCS-16 could always use only 2 bytes.
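
For a concrete look at the widths, this Python sketch (illustrative
only; the sample characters are my own choice) measures a few characters
in UTF-8 and in UTF-16, taking UTF-16 as the closest real encoding to
what is being called "UCS-16" here:

```python
# Illustration only: bytes per character in UTF-8 and UTF-16.

samples = {
    "LATIN SMALL A": "a",
    "E WITH ACUTE": "\u00e9",
    "EURO SIGN": "\u20ac",
    "GRINNING FACE": "\U0001F600",  # outside the Basic Multilingual Plane
}

widths = {}
for name, ch in samples.items():
    utf8_len = len(ch.encode("utf-8"))
    utf16_len = len(ch.encode("utf-16-le"))  # -le avoids the 2-byte BOM
    widths[name] = (utf8_len, utf16_len)
    print(f"{name}: utf-8={utf8_len} utf-16={utf16_len}")
```

Characters outside the Basic Multilingual Plane need 4 bytes in UTF-8
and a surrogate pair (also 4 bytes) in UTF-16, so a 16-bit encoding is
not strictly two bytes per character either.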

> Have you considered supporting NCHAR/NVARCHAR, aka NATIONAL character
> data?

There have been noises, but so far nobody has stepped up to the plate to
do the work.

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Those who use electric razors are infidels destined to burn in hell while
we drink from rivers of beer, download free vids and mingle with naked
well shaved babes." (http://slashdot.org/comments.pl?sid=44793&cid=4647152)
