Re: Ideas needed: How to create and store collation tables

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Ideas needed: How to create and store collation tables
Date:	2002-11-18 20:44:25
Message-ID:	1665.1037652265@sss.pgh.pa.us
Lists:	pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> I am trying to figure out which is the best way to store custom collation
> tables on a PostgreSQL server system, and what kind of interface to
> provide to users to allow them to create their own.

> A collation table essentially consists of a mapping 'character code ->
> weight' for every character in the set and some additional considerations
> for one-to-many and many-to-one mappings, plus a few feature flags.

I'd be inclined to handle it similarly to the way that Tatsuo did with
conversion_procs: let collations be represented by comparison functions
that meet some suitable API. I think that trying to represent such a
table as an SQL table compactly will be a nightmare, and trying to
access it quickly enough for reasonable performance will be worse. Keep
the problem out of the API and let each comparison function do what it
needs to do internally.
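
To make the suggestion concrete, here is a minimal sketch of what "collations as comparison functions" could look like. It is written in Python purely for illustration; the names `Collation`, `register_collation`, `make_weight_collation`, `compare_with`, and the toy weight table are all assumptions, not an existing PostgreSQL API. The point is only that the weight table lives inside the comparison function and never appears in the interface.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical API: a collation is any function that compares two strings
# and returns a negative, zero, or positive integer.
Collation = Callable[[str, str], int]

# Hypothetical registry, standing in for a catalog of comparison functions.
_registry: Dict[str, Collation] = {}


def register_collation(name: str, compare: Collation) -> None:
    """Record a comparison function under a collation name."""
    _registry[name] = compare


def make_weight_collation(weights: Dict[str, int]) -> Collation:
    """Build a comparator whose weight table stays private to the closure."""

    def key(s: str) -> List[Tuple[int, int]]:
        # Characters missing from the table sort after all weighted ones,
        # ordered by code point.
        return [(0, weights[c]) if c in weights else (1, ord(c)) for c in s]

    def compare(a: str, b: str) -> int:
        ka, kb = key(a), key(b)
        return (ka > kb) - (ka < kb)

    return compare


# Toy table (an assumption): case-insensitive ordering of a, b, c.
register_collation(
    "toy_ci",
    make_weight_collation({"a": 1, "A": 1, "b": 2, "B": 2, "c": 3, "C": 3}),
)


def compare_with(name: str, a: str, b: str) -> int:
    """Look up a collation by name and apply its comparison function."""
    return _registry[name](a, b)
```

A caller only ever sees `compare_with("toy_ci", x, y)`; a collation that needs one-to-many or many-to-one mappings could register a different function without any change to that interface.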

> Secondly, because each collation table depends on a particular character
> encoding (since it is indexed by character code), some sort of magic needs
> to happen when someone creates a database with a different encoding than
> the template database. One option is to do some mangling on the
> registered external file name (such as appending the encoding name to the
> file name). Another option is to have the notional pg_collate system
> catalog contain a column for the encoding, and then simply ignore all
> entries pertaining to encodings other than the database encoding.

SQL92 says that any particular collation is applicable to only one
character set (their term for what we call an "encoding").
So I think we'd definitely want to associate a character set with each
pg_collation entry, and then ignore any entries that don't match the
DB encoding. (Further down the road, "the" DB encoding might change
into just a "default for tables in this DB" encoding, meaning that we'd
need access to collations for multiple encodings anyway.)
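
As a rough illustration of the second option, the sketch below models catalog entries that each carry an encoding and filters them by the database encoding. It is Python for illustration only; `CollationEntry`, `visible_collations`, and the sample rows are hypothetical and do not describe an actual catalog layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CollationEntry:
    """Hypothetical catalog row: a collation name plus the encoding it applies to."""
    name: str
    encoding: str


def visible_collations(catalog: List[CollationEntry],
                       db_encoding: str) -> List[CollationEntry]:
    """Return only the entries whose encoding matches the database encoding."""
    return [entry for entry in catalog if entry.encoding == db_encoding]


# Sample rows (assumptions): the same collation name may be registered
# once per encoding.
catalog = [
    CollationEntry("german", "LATIN1"),
    CollationEntry("german", "UNICODE"),
    CollationEntry("japanese", "EUC_JP"),
]

latin1_names = [e.name for e in visible_collations(catalog, "LATIN1")]
```

If the database encoding later becomes only a default, the same lookup could take a per-table or per-column encoding as its argument instead.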

regards, tom lane

In response to

Ideas needed: How to create and store collation tables at 2002-11-18 18:10:12 from Peter Eisentraut

Responses

Re: Ideas needed: How to create and store collation at 2002-11-19 01:21:30 from Tatsuo Ishii