Re: Unicode + LC_COLLATE

From: "John Sidney-Woollett" <johnsw(at)wardbrook(dot)com>
To: "Priem, Alexander" <ap(at)cict(dot)nl>
Cc: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: Unicode + LC_COLLATE
Date: 2004-04-23 06:32:56
Message-ID: 4919.192.168.0.64.1082701976.squirrel@mercury.wardbrook.com
Lists: pgsql-general

Priem, Alexander said:
> Would lc-collate=C be bad in combination with UNICODE encoding? What
> lc-collate setting would you recommend for UNICODE encoding which will
> provide good sorting for all (most) common languages? (dutch, english,
> french, german)

It seems that LC_COLLATE=C is not a good idea when using UTF-8...
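
As an illustration only (not run on the server described below): under the C locale, sort compares raw byte values, so every uppercase ASCII letter sorts before every lowercase one. A minimal sketch, assuming a POSIX shell and the standard sort utility; the helper name sort_c is made up for this example:

```shell
# Hypothetical helper: sort the arguments using byte-order (C) collation.
# Under LC_ALL=C, 'Z' (0x5A) compares lower than 'a' (0x61).
sort_c() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# Prints Apple, Zebra, apple, banana (one per line).
sort_c apple Zebra banana Apple
```

A language-aware locale would normally interleave upper and lower case instead, but the exact order depends on the locale definitions installed on the machine.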

On my db server, /etc/sysconfig/i18n contains
LANG="en_GB.UTF-8"
SUPPORTED="en_GB.UTF-8:en_GB:en:en_US.UTF-8:en_US:en"
SYSFONT="latarcyrheb-sun16"

and locale -a produces

C
[..snip..]
en_GB
en_GB.iso885915
en_GB.utf8
[..snip..]
en_US
en_US.iso885915
en_US.utf8
[..snip..]
POSIX

and locale produces

locale
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=

QUESTION: Because I want UNICODE encoding/support in postgres, does that
mean that when I init my db I should specify (for en_GB support)

initdb -E UNICODE <-- (no --locale given, so the locale is taken from the environment)

or

initdb --locale=en_GB -E UNICODE

or should I use

initdb --locale=en_GB.utf8 -E UNICODE

Basically I want the database to be able to store text in any language
(hence UNICODE), and to sort according to the en_GB locale.
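
Before choosing between the spellings above, it may help to check which names the system actually lists. A rough sketch, assuming a POSIX shell with grep; locale_listed is a hypothetical helper, and the sample list is copied from the locale -a output quoted earlier in this mail:

```shell
# Hypothetical helper: succeed if the locale name given as $1 appears
# as an exact line in the list read from standard input.
locale_listed() {
    grep -Fqx -- "$1"
}

# Sample list taken from the `locale -a` output quoted in this mail;
# on a real server one would pipe `locale -a` into the helper instead.
sample='C
en_GB
en_GB.iso885915
en_GB.utf8
en_US
en_US.iso885915
en_US.utf8
POSIX'

for name in en_GB en_GB.utf8 en_GB.UTF-8; do
    if printf '%s\n' "$sample" | locale_listed "$name"; then
        echo "$name: listed"
    else
        echo "$name: not listed"
    fi
done
```

This is only a literal string match. Whether the C library treats en_GB.UTF-8 and en_GB.utf8 as the same locale is a separate matter that the sketch does not check, and it does not say which spelling initdb prefers.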

The monetary and date formats determined by the locale are irrelevant for
us, as we return correctly localised versions of this data from our web
app (which is connected to postgres) based on the user's selected language
for their current web browser session.

Thanks to anyone who can answer the initdb question above. Hopefully the
answer will help others too.

John Sidney-Woollett
