Refined LC_COLLATE or multiple database clusters?

From: Grega Bremec <grega(dot)bremec(at)noviforum(dot)si>
To: pgsql-admin(at)postgresql(dot)org
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Refined LC_COLLATE or multiple database clusters?
Date: 2004-06-09 10:33:03
Message-ID: 20040609103303.GA4016@elbereth.noviforum.si
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-hackers

Hello, List,

I recently stumbled across a problem that I can't really get across.

We have a database cluster (PG 7.4.2) that was initialized as follows:

$ pg_controldata /data/dir
pg_control version number: 72
Catalog version number: 200310211
Database cluster state: in production
pg_control last modified: sre 09 jun 2004 03:00:26 CEST
Current log file ID: 20
Next log file segment: 63
... <snip irrelevant checkpoint info> ...
Database block size: 8192
Blocks per segment of large relation: 131072
Maximum length of identifiers: 64
Maximum number of function arguments: 32
Date/time type storage: 64-bit integers
Maximum length of locale name: 128
LC_COLLATE: C
LC_CTYPE: C

Inside that cluster, there are several databases using different encodings:

template1=# \l
List of databases
Name | Owner | Encoding
-------------+----------+----------
db1 | ownera | UNICODE
db2 | ownera | UNICODE
db3 | ownerb | LATIN2
db4 | ownerc | LATIN2
db5 | ownera | LATIN2
db6 | ownera | LATIN2
template0 | postgres | UNICODE
template1 | postgres | UNICODE

Collate order for those databases, however, needs to be different. Obviously,
db3, db4, db5 and db6 will want to use some collate ordering scheme based on
ISO-8859-2, whereas the other two could possibly have table- or even column-
based collate requirements, as they contain unicode data in UTF-8 encoding,
which doesn't give any warranties wrt the contents of these databases.

Producing a list of requirements and/or imposing certain conventions on the
format of data stored in those tables is outside the scope of my authorities,
the only reasonable assumption I can make is that these databases could be
limited to one collating order per database (or several databases, as it is)
without much hassle.

Also, running several postmasters on this same machine is not an option, as
it only has 1.5GB RAM, of which only 1GB is available for cache (~260MB is
swapped after roughly a month's uptime, but that doesn't change much after
settling down in a week or two).

My question to the list would be the following:

Is it possible to do either of these things that could solve this problem
adequately:

- somehow manage to make one postmaster run on top of two separate
database clusters that would each have a different collate ordering
scheme

- use some other method of initializing one database from a different
template and taking with it LC_COLLATE setting (I suppose not, as
the "${PGDATA}/global/" directory is global to the cluster)

- use a patch that would add such functionality or upgrade to a version
(even if release-candidate, beta is not really an option, i gather)
of PostgreSQL that supported it

- in absence of any other viable solution, change the global setting of
the database cluster without having to dump/reinitdb/restore it

I thank you in advance for your valuable input,
--
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Scott Marlowe 2004-06-09 13:01:51 Re: [Fwd: Re: RHDB just sits and does nothing?]
Previous Message lise chhay 2004-06-09 10:26:11 Is the server running locally and accepting connections on Unix domain socket /tmp/.s.PGSQL.5432 ?

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Hallgren 2004-06-09 11:04:20 Re: Question regarding dynamic_library_path
Previous Message Greg Sabino Mullane 2004-06-09 10:19:52 Improving postgresql.conf