Re: Translations in the distributions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Dennis Björklund <db(at)zigo(dot)dhs(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Translations in the distributions
Date: 2004-01-09 22:40:49
Message-ID: 26750.1073688049@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> It might already help if we allowed LC_CTYPE to be attached to a
> database rather than the entire cluster, and make the user match them
> up manually. The only drawback would be that indexes on global tables
> involving upper() or lower() would no longer work reliably.

Make that "indexes on global tables involving any text wouldn't work".
Everyone has to have the same notion of the sort order, or the index is
corrupt from someone's point of view, and soon from everyone's point of
view. upper/lower isn't needed to cause a problem.

However ... we do not have any global tables with indexed text columns.
Only name columns, and name comparisons are presently not locale-aware
(they're just strncmp()). I think it wouldn't be unreasonable to
legislate that this remain true forevermore, and then it would be safe
to allow different DBs to run in different locales. That would be a big
step forward, for sure.

[ thinks more... ] Actually it's a bigger restriction than that.
Imagine that you create some tables with text data in template1, and
then index them. The indexes would be corrupt if you cloned template1
and assigned the result a different locale. So to make this work, we'd
actually need the following restrictions:

* No system table can ever have an index on a text/varchar/char column;
only name columns, and name has to remain locale-unaware.

* You can't assign a new locale to a cloned database if the source has
any text/varchar/char indexes.

The simplest implementation restriction I can think of to guarantee
point 2 is to allow changing the locale only when cloning template0,
not when cloning anything else. Or we could just warn people that
they'd better reindex after changing the locale.

It does seem like this might be a reasonable path to take. Thoughts?

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2004-01-09 22:50:37 Re: Encoding problems in PostgreSQL with XML data
Previous Message Mark Feit 2004-01-09 22:19:29 Current-stream read for psql's \copy