Re: [HACKERS] UTF8 or Unicode

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, tgl(at)sss(dot)pgh(dot)pa(dot)us, dpage(at)vale-housing(dot)co(dot)uk, oliver(at)opencloud(dot)com, zakkr(at)zf(dot)jcu(dot)cz, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] UTF8 or Unicode
Date: 2005-02-27 04:09:43
Message-ID: 200502270409.j1R49hc08394@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches


Here is an updated version that handles all cases. It renames the
conversion routines so that each routine name uses the primary encoding
name. This will be mentioned in the release notes in case anyone actually
uses those names in their code.

This patch requires renaming of the utf8_and_tcvn directory so it will
not apply cleanly.

I left the routines named utf_8 alone because the code splits encoding
names at breaks, like this: iso_8859_7_to_utf_8. I assume that is OK.
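
For anyone checking their own code against the renames, the naming
scheme discussed in this thread can be sketched as a small lookup. This
is only an illustration in Python, not code from the patch: the rename
table is the one proposed below, while the helper names primary_name
and routine_name are hypothetical, and the utf_8 spelling is assumed
from the existing routine names such as ascii_to_utf_8.

```python
# Old encoding names and the primary names proposed in this thread.
# The table comes from the thread; the helper names are hypothetical.
RENAMES = {
    "UNICODE": "UTF8",
    "ALT": "WIN866",
    "WIN": "WIN1251",
    "TCVN": "WIN1258",
}


def primary_name(encoding: str) -> str:
    """Return the primary name for an encoding, resolving old names."""
    key = encoding.upper()
    return RENAMES.get(key, key)


def routine_name(source: str, target: str) -> str:
    """Build a conversion routine name such as win1258_to_utf_8.

    UTF8 is spelled utf_8 here because the existing routine names
    (for example ascii_to_utf_8) were left with that spelling.
    """
    def spell(encoding: str) -> str:
        name = primary_name(encoding).lower()
        return "utf_8" if name == "utf8" else name

    return f"{spell(source)}_to_{spell(target)}"
```

With this sketch, routine_name("TCVN", "UNICODE") gives the renamed
routine win1258_to_utf_8, which replaces the old tcvn_to_utf_8.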

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > On Friday, 25 February 2005 05:51, Bruce Momjian wrote:
> > > so I see what he is saying. We are not consistent in favoring the
> > > official names vs. the common names.
> > >
> > > I will work on a patch that people can review and test.
> >
> > I think this is what we should do:
> >
> > UNICODE => UTF8
> > ALT => WIN866
> > WIN => WIN1251
> > TCVN => WIN1258
> >
> > That should clear it up.
>
> OK, here is a patch that makes those changes.
>
> The only uncertainty I have is with the use of the TCVN conversion
> routine names, e.g.:
>
> SELECT CONVERT('foo' USING tcvn_to_utf_8);
>
> I assume this is the same as:
>
> SELECT CONVERT('foo', 'WIN1258', 'UTF8');
> and
> SELECT CONVERT('foo', 'TCVN', 'UTF8'); -- alias usage
>
> So, why would people use the routine name? Both forms are documented.
> The first one with USING does not accept aliases, while the others do.
>
> I think this should be renamed to win1258_to_utf_8. However, this would
> be an incompatibility. We should mention it in the release notes.
>
> Other than that, the other conversion files were already named fine, e.g.
> ascii_to_utf_8 (no UNICODE); however, it is utf_8 and not utf8. I am
> unsure how to handle these.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Attachment Content-Type Size
unknown_filename text/plain 187.7 KB
