Re: Locale, Collation, ICU patch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Locale, Collation, ICU patch
Date: 2008-04-03 19:03:34
Message-ID: 6587.1207249414@sss.pgh.pa.us
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> The big gotcha is what collation to use when comparing with data in the system
> tables, especially the shared system tables. I think we do need to define a
> database-wide encoding and collation to use for system tables.

You mean cluster-wide? If we can get away with that, it'd solve a lot
of problems.

Note that the stuff in the system tables is mostly type "name" not text,
and the comparison semantics for that have always been strcmp(), so the
question of collation doesn't really apply. Name in itself doesn't care
about encoding either, but I think we have to restrict encoding to avoid
the problem of injecting data that's invalidly encoded into one database
from another via the shared catalogs.
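
To make the strcmp() point concrete, here is a minimal sketch contrasting a bytewise comparison with a locale-aware one. The function names are hypothetical and this is not the actual backend code; it only illustrates why collation does not enter into a strcmp()-style comparison.

```c
#include <string.h>

/*
 * Hypothetical helper, not the real catalog code: compare two
 * name-like strings byte by byte and normalize the result to
 * -1, 0, or 1. The outcome depends only on the byte values,
 * never on LC_COLLATE.
 */
int
name_cmp_sketch(const char *a, const char *b)
{
    int r = strcmp(a, b);

    return (r > 0) - (r < 0);
}

/*
 * Hypothetical counterpart for text-like data: strcoll() consults
 * the current LC_COLLATE setting, so the same pair of strings can
 * order differently under different locales.
 */
int
text_cmp_sketch(const char *a, const char *b)
{
    int r = strcoll(a, b);

    return (r > 0) - (r < 0);
}
```

With the bytewise version, "Zebra" sorts before "apple" because 'Z' (0x5A) is less than 'a' (0x61), whatever the locale is. The strcoll() version gives the same answer in the "C" locale but may not in others.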

The other issue that'd have to be resolved is the problem of system log
output. I think we'd want log messages to be written in one uniform
encoding (CSV output in particular is going to have a hard time
otherwise), but what do you do when you need to report something that
includes a character not present in that encoding?
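
To illustrate the log problem, one conceivable fallback (an assumption for illustration, not something proposed in this thread) is to escape every byte that the log encoding cannot represent. The sketch below assumes the log encoding is plain ASCII and writes any non-ASCII byte as a \xNN escape; the function name is hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical sketch: copy src into dst, passing 7-bit bytes through
 * and replacing every byte >= 0x80 with a four-character "\xNN" escape.
 * Output is truncated at a whole-unit boundary if dst is too small,
 * and is always NUL-terminated when dstsize > 0.
 * Returns the number of bytes written, not counting the NUL.
 */
size_t
log_escape_sketch(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *p;
    size_t      out = 0;

    if (dstsize == 0)
        return 0;

    for (p = (const unsigned char *) src; *p != '\0'; p++)
    {
        if (*p < 0x80)
        {
            /* need room for this byte plus the trailing NUL */
            if (out + 1 >= dstsize)
                break;
            dst[out++] = (char) *p;
        }
        else
        {
            /* need room for four escape characters plus the NUL */
            if (out + 4 >= dstsize)
                break;
            snprintf(dst + out, 5, "\\x%02X", (unsigned int) *p);
            out += 4;
        }
    }
    dst[out] = '\0';
    return out;
}
```

This keeps the log file in a single encoding at the cost of readability: the original character cannot be seen directly, only its bytes in the source encoding.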

regards, tom lane
