Re: different sort order in windows and linux version

From: Agent M <agentm(at)themactionfaction(dot)com>
To: Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: different sort order in windows and linux version
Date: 2006-07-02 16:25:43
Message-ID: 910202e1b6b4f76ae4f871e165d4e01f@themactionfaction.com
Lists: pgsql-general pgsql-hackers

On Jul 2, 2006, at 6:13 AM, Martijn van Oosterhout wrote:
> But I don't think anyone is actually considering importing ICU into the
> postgres source tree, are they?
Why not?

> Size - I'm not sure this is relevant since I don't think we want to
> incorporate it into postgres itself, just let people use it if they
> have it. In any case though, the default dataset is 8MB. This includes
> support for every locale and charset it knows about.
>
> If you drop the conversion stuff (because postgres already has that)
> you're down to about 4MB.
Why would you drop the ICU transcoding support instead of the existing
postgres functions? Why the duplicated effort?

>> Well, the Japanese think that UTF8 is not the solution to all their
>> worries, so they won't be happy with a UTF8-only solution. Likewise,
>> those of us who only need single-byte character sets won't be very
>> happy with being forced to accept multi-byte processing overhead.
>
> I've not quite understood the Japanese problem with Unicode. My
> understanding is that it was primarily due to widespread use of broken
> converters.

Certain Japanese characters cannot make a reliable round-trip through
Unicode. ICU uses UTF-16 as its store, so the Japanese folks won't be
happy with an ICU-only solution. However, it would still be of great
benefit to allow ICU to handle as much as possible, leaving the string
encodings to the encoding experts.
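
To make "round-trip" concrete, here is a minimal sketch of the check in Python. It assumes Python's built-in codecs as a stand-in for ICU's converters (the two do not necessarily share mapping tables), and `round_trips` is a hypothetical helper name, not part of ICU or postgres. A byte string fails the check when decoding to Unicode and encoding back yields different bytes, which is what can happen when a legacy charset maps two byte sequences to one code point.

```python
def round_trips(raw: bytes, encoding: str) -> bool:
    """Return True if raw survives decode -> Unicode -> encode unchanged.

    Hypothetical helper for illustration; uses Python's codecs, not ICU.
    """
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError:
        # Bytes that cannot be decoded at all never make the trip.
        return False
    try:
        return text.encode(encoding) == raw
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    samples = [
        # Ordinary kanji in Shift_JIS.
        ("shift_jis", "日本語".encode("shift_jis")),
        # A byte pair from a vendor extension row of CP932, often cited as
        # a duplicate mapping; the outcome depends on the codec's tables,
        # so it is printed rather than assumed.
        ("cp932", b"\x87\x9a"),
    ]
    for enc, raw in samples:
        print(enc, raw, round_trips(raw, enc))
```

Running the same check over real column data would show how much of it is actually affected before committing to a Unicode-only store.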

At the very least, it would be great to have ICU handle encoding on a
per-column basis (perhaps by extending the text datatype with encoding
info). Perhaps this would be a decent stopgap solution? The backend
protocol would also need a version bump, since it currently converts
all strings to a single encoding.

¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
AgentM
agentm(at)themactionfaction(dot)com
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
