Re: UTF8 or Unicode

From: Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Abhijit Menon-Sen <ams(at)oryx(dot)com>, List pgsql-hackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: UTF8 or Unicode
Date: 2005-02-15 09:22:03
Message-ID: 1108459323.4044.171.camel@petra
Lists: pgsql-hackers

On Mon, 2005-02-14 at 22:05 -0500, Bruce Momjian wrote:
> Abhijit Menon-Sen wrote:
> > At 2005-02-14 21:14:54 -0500, pgman(at)candle(dot)pha(dot)pa(dot)us wrote:
> > >
> > > Should our multi-byte encoding be referred to as UTF8 or Unicode?
> >
> > The *encoding* should certainly be referred to as UTF-8. Unicode is a
> > character set, not an encoding; Unicode characters may be encoded with
> > UTF-8, among other things.
> >
> > (One might think of a charset as being a set of integers representing
> > characters, and an encoding as specifying how those integers may be
> > converted to bytes.)
> >
> > > I know UTF8 is a type of unicode but do we need to rename anything
> > > from Unicode to UTF8?
> >
> > I don't know. I'll go through the documentation to see if I can find
> > anything that needs changing.
>
> I looked at encoding.sgml and that mentions Unicode, and then UTF8 as an
> acronym. I am wondering if we need to make UTF8 first and Unicode
> second. Does initdb accept UTF8 as an encoding?
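
The charset/encoding distinction quoted above can be made concrete with a
short Python sketch. It is only an illustration and has nothing to do with
the PostgreSQL sources; the helper name describe() is made up for this
example. The character set gives a character its integer (code point), and
each encoding turns that same integer into a different byte sequence.

```python
def describe(ch: str) -> dict:
    """Return the code point of one character and its bytes in two encodings."""
    return {
        "code_point": ord(ch),               # the integer Unicode assigns to the character
        "utf-8": ch.encode("utf-8"),         # that integer serialized as UTF-8
        "utf-16-be": ch.encode("utf-16-be"), # same integer, different encoding
    }


info = describe("\u010d")  # LATIN SMALL LETTER C WITH CARON
print(hex(info["code_point"]), info["utf-8"], info["utf-16-be"])
```

One code point, two different byte strings: that is why "UTF-8" names the
encoding while "Unicode" names the character set.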

In PG: unicode = utf8 = utf-8

Our internal routines in src/backend/utils/mb/encnames.c accept all
synonyms. The "official" internal PG name for UTF-8 is "UNICODE" :-(

UTF8 = UNICODE for historical reasons: "UNICODE" was there first. It's
the same as "WIN" for WIN1251 (marked in the sources as a
"_dirty_ alias")...

I think initdb uses pg_char_to_encoding() from
src/backend/utils/mb/encnames.c, so it should accept all aliases.
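
A rough sketch of how such an alias lookup can work, written in Python
rather than taken from encnames.c. Everything here is an assumption made
for illustration: the table contents, the normalization rule (drop
non-alphanumeric characters, then lowercase), and the names ALIASES,
normalize() and lookup_encoding() are hypothetical, not the real
PostgreSQL code.

```python
# Hypothetical alias table: several spellings map to one internal name.
# The canonical labels on the right are illustrative only.
ALIASES = {
    "unicode": "UNICODE",
    "utf8": "UNICODE",
    "win": "WIN1251",      # short historical alias
    "win1251": "WIN1251",
}


def normalize(name: str) -> str:
    """Drop punctuation and case so 'UTF-8', 'utf8' and 'Utf_8' compare equal."""
    return "".join(c.lower() for c in name if c.isalnum())


def lookup_encoding(name: str):
    """Return the internal name for any known alias, or None if unknown."""
    return ALIASES.get(normalize(name))


for spelling in ("unicode", "UTF8", "utf-8"):
    print(spelling, "->", lookup_encoding(spelling))
```

With a table like this, a caller such as initdb does not need to care
which spelling the user typed; every alias resolves to the same entry.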

Karel

--
Karel Zak <zakkr(at)zf(dot)jcu(dot)cz>
