Re: Locale implementation questions

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: kleptog(at)svana(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, gsstark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Locale implementation questions
Date: 2006-06-14 18:49:25
Message-ID: 200606141849.k5EInPj17579@candle.pha.pa.us
Lists: pgsql-hackers


Thread added to TODO.detail.

---------------------------------------------------------------------------

Tatsuo Ishii wrote:
> > 3. Compiled locale files are large. One UTF-8 locale datafile can
> > exceed a megabyte. Do we want the option of disabling it for small
> > systems?
>
> To avoid the problem, you could dynamically load the compiled
> tables. The charset conversion tables are handled in a similar way.
>
> Also I think it's important to allow user defined collate data. To
> implement the CREATE COLLATE syntax, we need to have that capability
> anyway.
>
> > 4. Do we want the option of running system locale in parallel with the
> > internal ones?
> >
> > 5. I think we're going to have to deal with the very real possibility
> > that our locale database will not be as good as some of the system
> > provided ones. The question is how. This is quite unlike timezones
> > which are quite standardized and rarely change. That database is quite
> > well maintained.
> >
> > Would people object to a configure option that selected:
> > --with-locales=internal (use pg database)
> > --with-locales=system (use system database for win32, glibc or MacOS X)
> > --with-locales=none (what we support now, which is neither)
> >
> > I don't think it will be much of an issue to support this, all the
> > functions take the same parameters and have almost the same names.
>
> To be honest, I don't understand why we have to rely on (often broken)
> system locales. I don't think building our own locale data is too
> hard, and once we have built it, the maintenance cost will be very small
> since it should not change regularly. Moreover, we could enjoy the
> benefit that PostgreSQL handles collations in a correct manner on any
> platform which PostgreSQL supports.
>
> > 6. Locales for SQL_ASCII. Seems to me you have two options, either
> > reject COLLATE altogether unless they specify a charset, or don't care
> > and let the user shoot themselves in the foot if they wish...
> >
> > BTW, this MacOS locale support seems to be new for 10.4.2 according to
> > the CVS log info, can anyone confirm this?
> >
> > Anyway, I hope this post didn't bore too much. Locale support has been
> > one of those things that has bugged me for a long time and it would be
> > nice if there could be some real movement.
>
> Right. We Japanese (and probably Chinese too) have been bugged by the
> broken multibyte locales for a long time. Using the C locale helps us to a
> certain extent, but for Unicode we need correct locale data, otherwise
> the sorted data will be complete chaos.
> --
> SRA OSS, Inc. Japan
> Tatsuo Ishii
>
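As an aside on the --with-locales choice quoted above: since the collation
functions take the same parameters, the three settings could sit behind one
wrapper. What follows is only a rough sketch in C, not code from any patch.
The names LocaleSource, internal_strcoll and demo_strcoll are hypothetical,
the internal path is a bytewise stand-in for a real internal locale database,
and the source is picked at run time here purely so that all three branches
compile in one file. A configure option would pick one at build time.

```c
#include <string.h>

/* Hypothetical mirror of --with-locales=internal|system|none. */
typedef enum
{
    LOCALES_NONE,       /* no locale support: plain byte comparison */
    LOCALES_SYSTEM,     /* defer to the platform's strcoll() */
    LOCALES_INTERNAL    /* use a locale database shipped with the server */
} LocaleSource;

/*
 * Stand-in for an internal collation routine (assumption: it would consult
 * our own compiled locale tables). Here it just compares bytes.
 */
static int
internal_strcoll(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* One entry point with the same parameters regardless of the backend. */
static int
demo_strcoll(LocaleSource src, const char *a, const char *b)
{
    switch (src)
    {
        case LOCALES_INTERNAL:
            return internal_strcoll(a, b);
        case LOCALES_SYSTEM:
            /* Result depends on the LC_COLLATE set via setlocale(). */
            return strcoll(a, b);
        case LOCALES_NONE:
        default:
            return strcmp(a, b);
    }
}
```

Callers would only ever see demo_strcoll(), so swapping the backend would not
touch the comparison sites. Whether the internal and system results agree
for a given locale is exactly the open question in point 5.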

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
