Re: Encoding, Unicode, locales, etc.

From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Encoding, Unicode, locales, etc.
Date: 2006-11-01 10:41:43
Message-ID: 20061101104143.GA4971@merkur.hilbert.loc
Lists: pgsql-general

On Tue, Oct 31, 2006 at 11:47:56PM -0500, Tom Lane wrote:

> Because we depend on libc's locale support, which (on many platforms)
> isn't designed to switch between locales cheaply. The fact that we
> allow a per-database encoding spec at all was probably a bad idea in
> hindsight --- it's out front of what the code can really deal with.
> My recollection is that the Japanese contingent argued for it on the
> grounds that they needed to deal with multiple encodings and didn't
> care about encoding/locale mismatch because they were going to use
> C locale anyway. For everybody else though, it's a gotcha waiting
> to happen.

Could this paragraph be put into the docs and/or the FAQ,
please? It should go along with the recommendation that if
you require multiple encodings for your databases, you had
better either have your OS locale configured properly for
UTF8 and use UNICODE databases, or run initdb with the C
locale.
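
To make the second option concrete, here is a minimal sketch
of the command lines I have in mind. It only builds the
argument lists and does not run anything. The helper names,
the data directory, and the database name are made up for
illustration; --locale, --encoding and -D are the usual
initdb/createdb switches, but please check them against the
version you have installed.

```python
import shlex


def initdb_command(datadir, locale="C", encoding="UTF8"):
    """Build (but do not run) an initdb command line.

    With the C locale, collation is plain byte comparison, so the
    cluster locale cannot conflict with a per-database encoding.
    """
    return ["initdb", "--locale=" + locale, "--encoding=" + encoding,
            "-D", datadir]


def createdb_command(dbname, encoding="UTF8"):
    """Build (but do not run) a createdb command line for one database."""
    return ["createdb", "--encoding=" + encoding, dbname]


def as_shell(argv):
    """Render an argument list as a copy-and-paste shell command."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical path and database name, for illustration only.
    print(as_shell(initdb_command("/path/to/data")))
    print(as_shell(createdb_command("legacy_latin1", encoding="LATIN1")))
```

The point of the C locale default is that every database in
the cluster can then pick its own encoding without tripping
over the encoding/locale mismatch described above.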

> This stuff is certainly far from ideal, but the amount of work involved
> to fix it is daunting; see many past pg-hackers discussions.

Here are a few data points from my Debian/Testing system in
favour of not worrying too much about the installed size of
ICU, since other packages already depend on it anyway:

libicu36
Reverse Depends:
  openoffice.org-writer        * OOo
  openoffice.org-filter-so52
  openoffice.org-core
  libxerces27                  * Xerces XML parser (Apache camp)
  libboost-regex1.33.1
  libboost-dbg

icu
Reverse Depends:
  libicu36
  libicu36
  libxercesicu26               * Xerces, again
  libxercesicu25
  libicu28-dev
  libicu28
  libicu21c102
  icu-i18ndata
  icu-data
  libwine                      * Wine

This, of course, does not decrease the work required to get
this going in PostgreSQL.

Thanks for the great work,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
