Re: ICU integration

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Doug Doole <ddoole(at)salesforce(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ICU integration
Date: 2016-09-09 09:31:39
Message-ID: CAMsr+YHqCAoUCSnkRGcFqNQgaMxzr6SbfihJoaEhUUfCv3OgaA@mail.gmail.com
Lists: pgsql-hackers

On 9 September 2016 at 16:21, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Thu, Sep 8, 2016 at 6:19 PM, Peter Eisentraut
> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>>
>> On 9/8/16 11:16 AM, Tom Lane wrote:
>> > This is a problem, if ICU won't guarantee cross-version compatibility,
>> > because it destroys the argument that moving to ICU would offer us
>> > collation behavior stability.
>>
>> It would offer a significant upgrade over the current situation.
>>
>> First, it offers stability inside the same version. Whereas glibc might
>> change a collation in a minor upgrade, ICU won't do that. And the
>> postgres binary is bound to a major version of ICU by the soname (which
>> changes with every major release). So this would avoid the situation
>> that a simple OS update could break collations.
>>
>> Second, it offers a way to detect that something has changed. With
>> glibc, you don't know anything unless you read the source diffs. With
>> ICU, you can compare the collation version before and after and at least
>> tell the user that they need to refresh indexes or whatever.
>>
>
> +1 on the importance of this last part.
>
> We may not be able to handle it directly, but just being able to point out
> to the user that "this index is incorrect, you have to reindex" and then
> refuse to use the index until that has been done would be a *huge*
> improvement. And it would definitely help solve an existing real-world
> problem, which is what can happen when you restore a physical backup onto a
> different version of an operating system at least.
>
> Sure, it would be even better if we could automatically *deal* with it. But
> failing in a loud and obvious way is a *lot* better than silently returning
> incorrect data...

Yep, I strongly agree. That's part of why I think this is well worth
doing even though it doesn't look like it'll give us a full solution
for stable collations.
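
As a minimal sketch of the "detect the change, warn, and refuse to use
the index" behaviour discussed above: this is not PostgreSQL code, and
every name in it (Index, recorded_version, check_index, the index name
and the version strings) is hypothetical. It only assumes the collation
library can report a version for each collation, which ICU does through
ucol_getVersion().

```python
from dataclasses import dataclass


@dataclass
class Index:
    # Hypothetical stand-in for catalog data about one index.
    name: str
    collation: str
    recorded_version: str  # collation version noted when the index was built
    usable: bool = True


def check_index(index, current_versions):
    """Compare the recorded collation version with the current one.

    Returns None when they match. Otherwise marks the index unusable
    and returns a warning telling the user to reindex.
    """
    current = current_versions.get(index.collation)
    if current == index.recorded_version:
        return None
    index.usable = False
    return (
        f'index "{index.name}" was built with collation "{index.collation}" '
        f"version {index.recorded_version}, but the library now reports "
        f"{current}; REINDEX required"
    )


# Made-up version strings, for illustration only.
idx = Index("users_name_idx", "en_US", "153.14")
print(check_index(idx, {"en_US": "153.14"}))  # versions match
print(check_index(idx, {"en_US": "153.64"}))  # versions differ
```

The point of the design is that a mismatch fails loudly: the index is
flagged rather than silently used with a different sort order.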

It's likely also a step or three toward case-insensitive
locales/collations, which I'm sure many people would like, though it's
far from the whole picture.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
