Re: WIP patch: Collation support

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Radek Strnad <radek(dot)strnad(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch: Collation support
Date: 2008-09-10 08:48:20
Message-ID: 20080910084820.GB27812@svana.org
Lists: pgsql-hackers

On Wed, Sep 10, 2008 at 11:29:14AM +0300, Heikki Linnakangas wrote:
> Radek Strnad wrote:
> >- because pg_collation and pg_charset are per-database catalogs, if you
> >want to create a database with a collation other than specified, create
> >that collation in template1 and then create the database
>
> I have to wonder, is all that really necessary? The feature you're
> trying to implement is to support database-level collation at first, and
> perhaps column-level collation later. We don't need support for
> user-defined collations and charsets for that.

Since the set of collations isn't exactly denumerable, we need some way
to allow the user to specify the collation they want. The only
collation PostgreSQL knows about is the C collation. Anything else is
user-defined.

> >Design & functionality changes left:
> >- move retrieving collation from pg_database to pg_type
>
> I don't understand this item. What will you move?

Long term, the collation is a property of the type, but I agree, I'm not
sure why this patch needs it.

> That's a tricky one. One idea is to prohibit choosing a different
> collation than the one in the template database, unless we know it's
> safe to do so without reindexing.

But that puts us back where we started: every database having the same
collation. We're trying to move away from that. Just reindex everything
and be done with it.
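
As a rough illustration of why the reindex is needed (a sketch only, not
PostgreSQL code): an index is a structure sorted under one collation, and
a search that assumes a different collation can miss rows that are there.
The names c_key, ci_key and index_lookup are hypothetical, and the
case-insensitive ordering is just a stand-in for a real locale collation.

```python
import bisect


def c_key(s):
    # "C" collation: compare the raw bytes of the string.
    return s.encode("utf-8")


def ci_key(s):
    # Stand-in for a locale collation (assumption): case-insensitive
    # ordering, with the raw bytes as a tiebreak.
    return (s.casefold(), s.encode("utf-8"))


def index_lookup(index, value, key):
    """Binary search that assumes `index` is sorted according to `key`."""
    keys = [key(v) for v in index]
    pos = bisect.bisect_left(keys, key(value))
    return pos < len(index) and index[pos] == value


words = ["apple", "Banana", "cherry", "Zebra"]

# "Index" built in the template database under the C collation.
index_c = sorted(words, key=c_key)

found_with_c = index_lookup(index_c, "Zebra", c_key)       # matching collation
found_with_ci = index_lookup(index_c, "Zebra", ci_key)     # mismatched collation

# "Reindex": rebuild the structure under the new collation.
index_ci = sorted(words, key=ci_key)
found_after_reindex = index_lookup(index_ci, "Zebra", ci_key)

print(found_with_c, found_with_ci, found_after_reindex)
```

The mismatched lookup fails even though the value is present, which is
the kind of silent breakage a reindex avoids.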

> Note that we already have the same problem with encodings. If you create
> a database with LATIN1 encoding, load it with data, and then use that as
> a template for a database with UTF-8 encoding, the text data will be
> incorrectly encoded. We should probably fix that too.

I'd say forbid more than one encoding in a cluster, but that's just my
opinion :)
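
For reference, the encoding problem Heikki describes can be sketched in a
few lines (an illustration under assumptions, not what the backend does:
the sample string is made up, and Python's codecs stand in for the
server's encoding handling). Bytes stored as LATIN1 are not generally
valid UTF-8, so relabelling the database without converting the data
leaves it incorrectly encoded.

```python
# Text as it would be stored in a LATIN1 database (sample value is made up).
latin1_bytes = "café".encode("latin-1")

# Relabelling the same bytes as UTF-8 without converting them.
try:
    latin1_bytes.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False

# What a proper conversion would produce instead.
converted = latin1_bytes.decode("latin-1").encode("utf-8")

print(valid_as_utf8, latin1_bytes, converted)
```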

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
