Re: server/db encoding (mix) issues

From: Jan-Peter(dot)Seifert(at)gmx(dot)de
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: server/db encoding (mix) issues
Date: 2008-09-08 08:57:58
Message-ID: 20080908085758.36590@gmx.net
Lists: pgsql-admin

Hello Peter,

thank you very much for your quick reply.

> Date: Thu, 04 Sep 2008 16:46:33 +0300
> From: Peter Eisentraut <peter_e(at)gmx(dot)net>
> To: Jan-Peter Seifert <Jan-Peter(dot)Seifert(at)gmx(dot)de>
> CC: Postgres <pgsql-admin(at)postgresql(dot)org>
> Subject: Re: [ADMIN] server/db encoding (mix) issues

> Jan-Peter Seifert wrote:
> > we have a mix of older software still using LATIN1 as db encoding and
> > the psqlODBC-drivers (ANSI) and newer software using UTF8 as db encoding. As
> > running two server instances would use up more resources(?) than just one
> > we'd like to have all dbs in one cluster. Which cons against this solution
> > are there? Which operating system locale should be used then? C locale is
> > recommended in the docs - also because of better performance. However, the
> > language of the software is not English but German - so shouldn't there be
> > problems with sorting German Umlauts etc. correctly etc.? Which encoding
> > should the server have - UTF8/Unicode or LATIN1? BTW which is the correct
> > locale for LATIN1 and German (de_DE (my guess) or de_DE@euro (which seems to be
> > for LATIN9)). Using SQL_ASCII doesn't seem to be a wise choice. Are there
> > no problems when connecting with psqlODBC-ANSI drivers if the server
> > encoding is UTF8/Unicode? I'd be happy if you could enlighten me a bit.
>
> Set your locale to de_DE.utf8 and use UTF8 as server encoding.

Well - I did set up two instances of 8.3.3 on an Ubuntu 7.10 system last week, each under a different user account. I set the locale for each account in its .bashrc ("export LANG=de_DE" and "export LANG=de_DE.UTF-8" respectively). After that I ran initdb ("initdb --encoding='LATIN1' -W -A md5 -D $PGDATA" and "initdb --encoding='UTF8' -W -A md5 -D $PGDATA"(?)). I'm not sure whether I actually specified the server encoding for the UTF8 instance, though. Did I do something wrong?
However, when I try to create a UTF-8 db in the LATIN1 server or a LATIN1 db in the UTF-8 server, I get an error saying that the db encoding does not match the server locale and that the server's LC_CTYPE locale requires the server's encoding. Before that I thought it just failed because there is no locale named LATIN1 on Windows. Are those additional encoding checks new in v8.3.3, or were they already in place in v8.3.1?
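
To illustrate why such a check makes sense, here is a small Python sketch (my own illustration, not anything taken from PostgreSQL; the sample word is arbitrary). The same text is stored as different bytes in the two encodings, so a locale expecting one codeset would misread data stored in the other:

```python
# Illustration only: how the same text looks under LATIN1 and UTF-8.
word = "Käse"

latin1_bytes = word.encode("latin-1")  # one byte per character
utf8_bytes = word.encode("utf-8")      # the umlaut becomes two bytes

# UTF-8 bytes read by a LATIN1 locale: the umlaut turns into two characters.
misread = utf8_bytes.decode("latin-1")

# LATIN1 bytes containing an umlaut are not valid UTF-8 at all.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

print(latin1_bytes, utf8_bytes, misread, latin1_is_valid_utf8)
```

So functions that depend on LC_CTYPE (upper/lower, character classes) would presumably see the wrong characters whenever the two settings disagree.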
This makes me wonder whether there would be any problems with migrating the LATIN1 databases to UTF8 while still using the psqlODBC ANSI drivers to connect the non-Unicode-capable applications. A quick test worked, but ...
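
A quick Python check (again only a sketch; the codec names are Python's, not PostgreSQL's) suggests where trouble could come from. Every LATIN1 byte converts to UTF-8 and back without loss, but once a UTF8 database holds characters outside LATIN1, a LATIN1 client has no way to represent them. The euro sign is the obvious candidate, since it exists in LATIN9 but not in LATIN1:

```python
# All 256 LATIN1 byte values survive a round trip through UTF-8.
all_latin1 = bytes(range(256))
as_utf8 = all_latin1.decode("latin-1").encode("utf-8")
round_trip = as_utf8.decode("utf-8").encode("latin-1")
lossless = round_trip == all_latin1


def fits(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


euro_in_latin1 = fits("€", "latin-1")     # ISO 8859-1
euro_in_latin9 = fits("€", "iso8859-15")  # ISO 8859-15, i.e. LATIN9

print(lossless, euro_in_latin1, euro_in_latin9)
```

If that reasoning carries over, the existing LATIN1 data itself should migrate cleanly, and problems would only show up for characters that newer, Unicode-capable software writes into the same database.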

> I would be interested to know where the documentation "recommends" using
> the C locale. That would certainly not be reasonable for many uses.

It isn't really recommended:
http://www.postgresql.org/docs/8.3/static/release-8-3.html
But the consequences could perhaps be pointed out more clearly.

http://www.postgresql.org/docs/8.3/interactive/locale.html
"The drawback of using locales other than C or POSIX in PostgreSQL is its performance impact. It slows character handling and prevents ordinary indexes from being used by LIKE. For this reason use locales only if you actually need them."
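
My worry about sorting can be shown with a few lines of Python (a sketch under assumptions: the word list is made up, and the "German-like" key is only a crude stand-in that treats umlauts like their base vowels, not what a real de_DE locale does). The C locale compares raw bytes, so in a UTF8 database every umlaut sorts after "Z":

```python
words = ["Zebra", "Äpfel", "Apfel", "Ökonom", "Ofen"]

# C/POSIX collation compares raw bytes; sorting on the UTF-8 bytes mimics that.
c_order = sorted(words, key=lambda w: w.encode("utf-8"))

# Hypothetical stand-in for a German collation: fold umlauts to their base
# vowels and use the original string only as a tie-breaker.
FOLD = str.maketrans("ÄÖÜäöü", "AOUaou")
german_like = sorted(words, key=lambda w: (w.translate(FOLD), w))

print(c_order)      # byte order
print(german_like)  # approximate dictionary order
```

The first list ends with the two umlaut words, the second keeps them next to "Apfel" and "Ofen", which is what a German user would expect.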

Thank you very much,

Peter
