Re: Confused about locales

From: "Tomi NA" <hefest(at)gmail(dot)com>
To: "Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Confused about locales
Date: 2006-08-30 13:38:09
Message-ID: d487eb8e0608300638i17c9c5d6ya58845dcaa2b01dd@mail.gmail.com
Lists: pgsql-general

On 8/19/06, John Gunther <owner(at)bucksvsbytes(dot)com> wrote:
> I've been reading about locales, encodings, sort orders, the to_ascii
> function and, embarrassingly, I'm more confused than enlightened.
>
> What I want is very simple:
> 1) I want the database to correctly accept, store, and display
> alphabetic characters, including European accented characters, in HTML
> forms.
> 2) I want sorting to ignore the diacritical marks so that, for example,
> u, u-accent, and u-umlaut are all sorted as if they were plain u.
> 3) I want sorting to ignore non-alphanumerics, letter case, and white space.
>
> To illustrate, the following data is in sorted order:
>
> St-Émile
> stendahl
> st ènders
> St. Epson
>
> Can someone tell me what combination of PostgreSQL and Linux settings I
> need for this? It seems like a very basic question, but I'm just dense,
> I guess. I've tried a half dozen time-consuming configs without success.

Well, you'll obviously have to use UTF-8 if you plan on supporting more
than one language with different accented characters. The sorting
issue is a bit of a problem, though. PostgreSQL uses the same collation in
all databases in a database cluster (carved in stone at cluster
init), so I don't know of a good way to collate your data the way you
want. You could conceivably keep a copy of the accented strings, replacing
the accented characters with their non-accented counterparts as you see
fit, and sort on that column, but that's not a very elegant way of
handling the problem, is it?
You might have more luck with another database like MySQL 4.1+ (where
accent-insensitive UTF-8 collation is directly supported), MS SQL Server
(where you can define encoding and collation settings at the database
level, and so conceivably have a database for each language, if you know
exactly which languages you'll have) or Firebird (where you define an
encoding at the column level and can collate any way you wish in each
column).
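
For what it's worth, here is a rough sketch of the "copy without accents"
workaround mentioned above, done on the application side in Python. The
function name sort_key is made up for illustration, and it assumes that
stripping Unicode combining marks, punctuation, white space and letter case
is all you need; the resulting key is what you would store in the extra
column and sort on.

```python
import unicodedata


def sort_key(text: str) -> str:
    """Build an accent-, case-, and punctuation-insensitive sort key.

    Hypothetical helper: the result is meant to be stored in a separate
    column and used for ordering instead of the original string.
    """
    # Split accented characters into base letter + combining mark(s),
    # e.g. "É" becomes "E" followed by U+0301.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only letters and digits: this drops the combining marks as
    # well as punctuation and white space.
    kept = [
        ch for ch in decomposed
        if not unicodedata.combining(ch) and ch.isalnum()
    ]
    # casefold() is a more aggressive lower() for caseless comparison.
    return "".join(kept).casefold()


names = ["St. Epson", "stendahl", "St-Émile", "st ènders"]
for name in sorted(names, key=sort_key):
    print(name)
```

With John's four sample strings this prints them in the order he listed.
Letters that have no decomposition (for example "ø" or "ß") are not mapped to
a plain ASCII letter by this approach, so they would need their own
replacement table.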

Hope I've helped,
t.n.a.
