Re: Collations and Replication; Next Steps

From: Greg Stark <stark(at)mit(dot)edu>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: Matthew Kelly <mkelly(at)tripadvisor(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Matthew Spilich <mspilich(at)tripadvisor(dot)com>
Subject: Re: Collations and Replication; Next Steps
Date: 2014-09-17 14:57:38
Message-ID: CAM-w4HO_zwwo3eGgX0Lm_0607q5YZ1Pe-K2OqGiVYCqB4N7umQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 17, 2014 at 3:47 PM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
> I don't think we cannot achieve that because even MySQL accomplishes:-)

We've always considered it an advantage that we're consistent with the
collations in the rest of the system. Generally speaking, it's a
strength that Postgres integrates with the system rather than being a
separate system unto itself.

Consider bug reports like "I've configured my system to use
fr_FR.UTF-8 and 'sort' produces output in this order, so why is
Postgres producing output in a different order?" Or extension authors
using strcoll and being surprised that the module's results are
inconsistent with the ordering used by the database.
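
As a rough sketch of the extension-author case, this is roughly what
code relying on the system collation looks like. The helper name
sort_with_locale is hypothetical (it is not a Postgres or libc
function); only setlocale, strcoll and qsort are real libc calls. If
the database sorted by some other library's rules, an array ordered
this way could disagree with the order of rows coming back from a
query.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: order two C strings by the current LC_COLLATE rules. */
static int compare_collated(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcoll(*sa, *sb);
}

/*
 * Hypothetical helper: switch LC_COLLATE to locale_name and sort the
 * array in place using the system collation.  Returns 0 on success,
 * or -1 if the requested locale is not installed on this system.
 */
int sort_with_locale(const char **items, size_t n, const char *locale_name)
{
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;

    qsort(items, n, sizeof(items[0]), compare_collated);
    return 0;
}
```

Calling it with "fr_FR.UTF-8" should give the same ordering as the
system "sort" command run under that locale, provided the locale is
installed; with "C" it degenerates to plain byte order.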

Separately, we always had a huge problem with ICU: it depended on
storing everything in a native UTF-16 encoding and required converting
to and from UTF-8 using an iterator interface. I've heard that has
improved somewhat, but from what I understand it would still be a
struggle to avoid copying every string before using it and consuming
twice as much memory. No more using strings directly out of disk
buffers.
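
To put a number on the memory concern, here is a small sketch that
counts how many 16-bit code units a UTF-8 string would need once
converted. It does not use ICU at all; utf16_units_for_utf8 is a
hypothetical helper and assumes its input is already valid UTF-8. For
plain ASCII text the result equals the byte length, so the converted
copy takes two bytes per character where the original took one.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count the UTF-16 code units needed to hold a
 * UTF-8 string of len bytes.  Assumes valid UTF-8; no validation is done.
 */
size_t utf16_units_for_utf8(const unsigned char *s, size_t len)
{
    size_t units = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = s[i];

        /* Continuation bytes (10xxxxxx) belong to a character already counted. */
        if ((c & 0xC0) == 0x80)
            continue;

        /* A 4-byte lead (11110xxx) encodes a code point above U+FFFF,
         * which needs a surrogate pair; everything else fits in one unit. */
        units += (c >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

Multiplying the result by two gives the size in bytes of the
converted copy, which would exist alongside the original UTF-8 bytes
still sitting in the buffer.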

Then there's the concern that ICU is a *huge* dependency. ICU is
itself larger than the entire Postgres install. It's a big burden on
users to have to install and configure a second collation library in
addition to the system library, and it's a complete non-starter for
embedded or low-memory systems.

--
greg
