Re: [HACKERS] UTF8 or Unicode

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Markus Bertheau ?" <twanger(at)bluetwanger(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, dpage(at)vale-housing(dot)co(dot)uk, oliver(at)opencloud(dot)com, zakkr(at)zf(dot)jcu(dot)cz, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] UTF8 or Unicode
Date: 2005-03-02 17:54:20
Lists: pgsql-hackers, pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
>> The correct encoding name is "UTF-8".

> True, but Peter says the ANSI standard calls it UTF8 so that's what I
> used.

What SQL99 actually says is

         -  UTF8 specifies the name of a character repertoire that consists
            of every character represented by The Unicode Standard Version
            2.0 and by ISO/IEC 10646 UTF-8, where each character is encoded
            using the UTF-8 encoding, occupying from 1 (one) through 6

That is, "UTF8" is an identifier chosen to refer to an encoding which
they know perfectly well is really called UTF-8.  We should probably
follow the same convention of using UTF8 in code identifiers and UTF-8
in documentation.  In particular, UTF_8 with an underscore is sanctioned
by nobody and should be avoided.
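
A minimal sketch of that convention, assuming a small normalization helper
is wanted: every name below (CODE_IDENTIFIER, DOC_NAME, is_utf8_spelling,
code_identifier, doc_name) is hypothetical and not taken from the PostgreSQL
sources. It accepts the common spellings, including the discouraged UTF_8,
and maps them to UTF8 for identifiers or UTF-8 for documentation text.

```python
import re

# Hypothetical names for illustration only; not PostgreSQL code.
CODE_IDENTIFIER = "UTF8"   # spelling used in code identifiers, as SQL99 does
DOC_NAME = "UTF-8"         # the encoding's real name, used in documentation

# Matches UTF8, UTF-8, UTF_8 and "UTF 8" in any letter case.
_UTF8_SPELLING = re.compile(r"utf[-_ ]?8", re.IGNORECASE)


def is_utf8_spelling(name: str) -> bool:
    """Return True if name is some spelling of the UTF-8 encoding name."""
    return _UTF8_SPELLING.fullmatch(name.strip()) is not None


def code_identifier(name: str) -> str:
    """Map any accepted spelling to the identifier form, UTF8."""
    if not is_utf8_spelling(name):
        raise ValueError(f"not a UTF-8 spelling: {name!r}")
    return CODE_IDENTIFIER


def doc_name(name: str) -> str:
    """Map any accepted spelling to the documentation form, UTF-8."""
    if not is_utf8_spelling(name):
        raise ValueError(f"not a UTF-8 spelling: {name!r}")
    return DOC_NAME
```

Accepting UTF_8 on input while never emitting it is one possible design
choice; a stricter variant could reject that spelling outright.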

			regards, tom lane
