From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | Tomi NA <hefest(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: collation & UTF-8 |
Date: | 2006-02-24 17:29:13 |
Message-ID: | 20060224172913.GA9390@svana.org |
Lists: | pgsql-general |
On Fri, Feb 24, 2006 at 06:23:07PM +0100, Tomi NA wrote:
> I'm using PostgreSQL 8.1.2 on Linux and want to load UTF-8 encoded varchars.
> While I can store and get at stored text correctly, the ORDER BY places all
> accented characters (Croatian, in this case - probably marked hr_HR) after
> non-accented characters.
> This is no showstopper, but it does affect the general perception of
> application quality.
Collation is a function of the OS: PostgreSQL sorts text according to
the locale the database was set up with. So, is your database's locale
set up for UTF-8 collation? It would probably be called hr_HR.UTF-8.
> is there an official way to set up UTF8 collation so that "SELECT first_name
> FROM persons ORDER BY first_name" works as expected?
Yes, set up the locale correctly. In general, PostgreSQL should give the
same results as sort(1) on the command line. Use that to experiment:
LC_ALL=hr_HR.UTF-8 sort < input > output
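A slightly fuller sketch of that experiment, in case it helps. The file names and the sample names are made up for illustration, and the check assumes `locale -a` lists the locale as either hr_HR.utf8 or hr_HR.UTF-8, which varies by system:

```shell
# Hypothetical sample data: a few Croatian first names, one per line.
printf '%s\n' 'Zoran' 'Čedomir' 'Ana' 'Šime' 'Cvita' > names.txt

# Plain byte-order collation: every accented letter sorts after all
# ASCII letters, which matches the behaviour described in the question.
LC_ALL=C sort names.txt > sorted_c.txt

# Croatian collation, but only if the OS actually has the locale installed.
if locale -a 2>/dev/null | grep -Eqix 'hr_HR\.utf-?8'; then
    LC_ALL=hr_HR.UTF-8 sort names.txt > sorted_hr.txt
else
    echo 'hr_HR.UTF-8 is not installed; generate it before retrying' >&2
fi
```

If sorted_hr.txt comes out in the same order as sorted_c.txt, the problem is in the OS locale rather than in PostgreSQL. With a working Croatian locale I would expect Čedomir to follow Cvita and Šime to come before Zoran.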
Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.