From: | Nicolaus Erichsen <nico(dot)erichsen(at)hsh-berlin(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Issues with german 'Umlaute' |
Date: | 2002-10-17 15:06:36 |
Message-ID: | 200210171706.36054.nico.erichsen@hsh-berlin.com |
Lists: | pgsql-bugs |
Hello everybody,
I recently found a problem with the sorting of German 'Umlaute'. I hope the
encoding of this mail works ;-) :
Postgres puts Umlaute (i.e., ÄäÖöÜü) at the very end of the alphabet, and
this is not the way it should be. I didn't check the special character
'ß', but it's probably similar.
The canonical sort order for Umlaute is to treat them as two characters, like
this:
ä -> ae
ö -> oe
ü -> ue
ß -> ss
(and the same for upper case 'ÄÖÜ'; 'ß' does not have an upper case form)
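To make the intended order concrete, here is a minimal Python sketch of the two-character expansion described above. It is only an illustration of the rule, not anything Postgres does; the names EXPANSIONS and expansion_key are made up for this example.

```python
# Hypothetical helper (not part of Postgres): expand each Umlaut into
# two plain letters, then compare the expanded strings case-insensitively.
EXPANSIONS = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}

def expansion_key(word):
    """Return the sort key for a word under the 'ä -> ae' rule."""
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in word)
    return expanded.casefold()

words = ["Zebra", "Ärger", "Apfel", "Ofen", "Öl", "Udo", "über"]
sorted_words = sorted(words, key=expansion_key)
print(sorted_words)
# ['Ärger', 'Apfel', 'Öl', 'Ofen', 'Udo', 'über', 'Zebra']
```

With this key 'Ärger' sorts as 'aerger', so it lands before 'Apfel' rather than after 'Zebra'.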
Well, I guess this might be difficult to implement and might have quite an
impact on performance. The solution I know from other databases consists of
inserting ä after a, ö after o, ü after u and ß after s. Afaik this is
generally accepted.
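The simpler scheme could be sketched like this in Python, assuming each Umlaut is treated as a separate letter that comes directly after its base letter. Again this only illustrates the ordering; AFTER_BASE and after_base_key are hypothetical names, not an existing API.

```python
# Hypothetical helper (not part of Postgres): every character becomes a
# pair (base letter, rank). Plain letters get rank 0, Umlaute and 'ß'
# get rank 1, so 'ä' sorts after every 'a' but before 'b'.
AFTER_BASE = {"ä": "a", "ö": "o", "ü": "u", "ß": "s"}

def after_base_key(word):
    """Return a list of (base, rank) pairs used as the sort key."""
    key = []
    for ch in word.lower():
        if ch in AFTER_BASE:
            key.append((AFTER_BASE[ch], 1))
        else:
            key.append((ch, 0))
    return key

names = ["Bach", "Ärger", "Azur", "Apfel", "Öl", "Ozean", "Paul"]
ordered = sorted(names, key=after_base_key)
print(ordered)
# ['Apfel', 'Azur', 'Ärger', 'Bach', 'Ozean', 'Öl', 'Paul']
```

Here 'Ärger' comes after 'Azur' but before 'Bach', which is the "ä after a" behaviour, and the key is built in a single pass over each word.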
upper() does not handle Umlaute correctly either. It leaves äöü unchanged
instead of converting them to upper case.
All this happens with a database created with encoding = 'latin1'. If there
are better results with a different encoding (I haven't tried that yet), I'd
suggest adding some information about this to the documentation.
Thanks for your work,
N.Erichsen
--
HSH Soft-und Hardware Vertriebs GmbH
Rudolf-Diesel-Straße 2 - 16321 Lindenberg
Tel. (030) 94004 - 509 Fax (030) 94004 - 400