Re: UTF8 or Unicode

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Abhijit Menon-Sen <ams(at)oryx(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UTF8 or Unicode
Date: 2005-02-15 03:05:08
Message-ID: 200502150305.j1F358K20470@candle.pha.pa.us
Lists: pgsql-hackers

Abhijit Menon-Sen wrote:
> At 2005-02-14 21:14:54 -0500, pgman(at)candle(dot)pha(dot)pa(dot)us wrote:
> >
> > Should our multi-byte encoding be referred to as UTF8 or Unicode?
>
> The *encoding* should certainly be referred to as UTF-8. Unicode is a
> character set, not an encoding; Unicode characters may be encoded with
> UTF-8, among other things.
>
> (One might think of a charset as being a set of integers representing
> characters, and an encoding as specifying how those integers may be
> converted to bytes.)
>
> > I know UTF8 is a type of unicode but do we need to rename anything
> > from Unicode to UTF8?
>
> I don't know. I'll go through the documentation to see if I can find
> anything that needs changing.

I looked at encoding.sgml; it mentions Unicode first, and then UTF8 as an
acronym. I am wondering whether we should put UTF8 first and Unicode
second. Does initdb accept UTF8 as an encoding?
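
To make the charset/encoding distinction quoted above concrete, here is a
minimal sketch in Python (not PostgreSQL code; the sample string and the
describe() helper are made up for illustration). It shows the same Unicode
code points turning into different byte sequences depending on the encoding
chosen:

```python
# The Unicode character set assigns each character an integer (code point).
text = "é€"
code_points = [ord(ch) for ch in text]   # the integers 0xE9 and 0x20AC

# An encoding specifies how those integers are converted to bytes.
utf8_bytes = text.encode("utf-8")        # 2 bytes for é, 3 bytes for €
utf16_bytes = text.encode("utf-16-be")   # 2 bytes for each character here


def describe(s):
    """Hypothetical helper: pair each code point with its UTF-8 bytes in hex."""
    return [(f"U+{ord(ch):04X}", ch.encode("utf-8").hex()) for ch in s]


for code_point, hex_bytes in describe(text):
    print(code_point, "->", hex_bytes)
```

The code points are the same in both cases; only the byte representation
differs, which is why "UTF-8" names the encoding and "Unicode" names the
character set.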

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
