Re: Converting MySQL tinyint to PostgreSQL

From: Dawid Kuroczko <qnex42(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Joe <svn(at)freedomcircle(dot)net>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Converting MySQL tinyint to PostgreSQL
Date: 2005-07-13 08:48:56
Message-ID: 758d5e7f050713014831121552@mail.gmail.com
Lists: pgsql-general

On 7/13/05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > Personally I would settle for a fuller set of small fixed size datatypes. The
> > "char" datatype is pretty much exactly what's needed except that it provides
> > such a quirky interface.
>
> I'm not actually against inventing an int1/tinyint type. I used to be
> worried that it would screw up the numeric datatype promotion hierarchy
> even more than it already was screwed up :-( ... but I think we have
> dealt with most of those issues now. It'd be worth trying anyway ---
> much more so than trying to optimize char(1), IMHO.

The problem with an int1 type is that the smaller the type, the stronger
the push for unsigned variants... I think it may be worth doing, but it is
not really the main problem -- smallint is fine for most situations. The
only place where I was unhappy with signed integers was... int4
(I wanted to store full 32-bit unsigned values, so I had to use bigint;
with a couple of million rows it's a bit of a waste ;)).
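
To put a number on that waste, here is a rough Python sketch. It only
counts column data (it ignores row headers and alignment padding), the
helper names are made up for illustration, and the row count of 2,000,000
is an assumed stand-in for "a couple of million".

```python
# Signed integer types in PostgreSQL: (name, bytes per value).
SIGNED_TYPES = [("smallint", 2), ("integer", 4), ("bigint", 8)]


def smallest_signed_type(value):
    """Return (name, size) of the smallest signed type that can hold value."""
    for name, size in SIGNED_TYPES:
        bits = size * 8
        if -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1:
            return name, size
    raise OverflowError("value does not fit in bigint")


def extra_bytes(rows, wanted_size=4, actual_size=8):
    """Column bytes lost by storing each value in a wider type than wanted."""
    return rows * (actual_size - wanted_size)


# The largest 32-bit unsigned value does not fit in a signed int4.
print(smallest_signed_type(2 ** 32 - 1))
# Assumed row count, for illustration only.
print(extra_bytes(2_000_000))
```

With those assumptions the overhead comes to 4 bytes per row, so the cost
grows linearly with the table.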

As for the char/varchar type -- I was wondering. The worst-case
scenario for UTF-8 (correct me on this) is when 1 character
takes 4 bytes. And the biggest problem with char/varchar is that
the length indicator takes 4 bytes... How much overhead would
it be to make the length indicator variable-width, for example:

(var)char(1)-char(63) -- 1 byte length + string
char(64)-char(16383) -- 2 byte length + string
char(16384)-text -- 4 byte length + string, like now

This would reduce the size of a char(5) string from 9 bytes to
6 bytes, and a char(2) string from 6 bytes to 3 bytes, assuming
single-byte characters (for multibyte chars it would also be a win).
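
The arithmetic can be checked with a small Python sketch. It models only
the proposed length header plus the UTF-8 payload, not PostgreSQL's actual
on-disk format, and the function names are hypothetical.

```python
def header_bytes(max_chars):
    """Length-header width for a declared char(n) under the proposed scheme."""
    if max_chars <= 63:
        return 1
    if max_chars <= 16383:
        return 2
    return 4


def stored_size(text, max_chars, proposed=True):
    """Header bytes plus UTF-8 payload bytes for one stored value."""
    payload = len(text.encode("utf-8"))
    header = header_bytes(max_chars) if proposed else 4
    return header + payload


# char(5) and char(2) with single-byte characters, current vs proposed.
print(stored_size("abcde", 5, proposed=False), stored_size("abcde", 5))
print(stored_size("ab", 2, proposed=False), stored_size("ab", 2))

# Worst case of 4 bytes per character still fits each header width.
print(63 * 4 <= 255, 16383 * 4 <= 65535)
```

The last line suggests why the thresholds may sit at 63 and 16383: even at
4 bytes per character the byte length fits in one or two bytes
respectively. That is a reading of the numbers, not something stated above.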

I don't know the internals too well (read: at all), but I guess there
would be a problem of choosing which length-header width to use --
would it be possible to do some sort of on-the-fly mapping
when creating tables: varchar(224) is text_2bytelength,
text is text_4bytelength, char(1) is text_1bytelength...?
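
As a sketch of what such a mapping might look like, here it is in Python.
The text_Nbytelength names are the hypothetical ones from this message, not
real PostgreSQL types, and the parsing is deliberately naive: anything that
is not char(n) or varchar(n) falls back to the 4-byte variant.

```python
import re

# Matches char(n) and varchar(n); everything else is treated like text.
_DECLARED = re.compile(r"^\s*(?:var)?char\s*\(\s*(\d+)\s*\)\s*$", re.IGNORECASE)


def internal_type(declared):
    """Map a declared column type to a hypothetical internal storage type."""
    match = _DECLARED.match(declared)
    if match is None:
        return "text_4bytelength"  # text, unbounded varchar, anything else
    max_chars = int(match.group(1))
    if max_chars <= 63:
        return "text_1bytelength"
    if max_chars <= 16383:
        return "text_2bytelength"
    return "text_4bytelength"


for declared in ("varchar(224)", "text", "char(1)"):
    print(declared, "->", internal_type(declared))
```

The choice would be made once, at table creation time, from the declared
maximum length, so no per-value decision is needed afterwards.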

Regards,
Dawid
