Re: ICU integration

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Doug Doole <ddoole(at)salesforce(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ICU integration
Date: 2016-09-07 01:38:19
Message-ID: CAM3SWZTLHsgFryPcOFFTCe-TTP42y7kwfrErxM1Fk6uTC=KCfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 6, 2016 at 10:40 AM, Doug Doole <ddoole(at)salesforce(dot)com> wrote:
> - Suppose in ICU X.X, AA = Å but in ICU Y.Y AA != Å. Further suppose there
> was an RI constraint where the primary key used AA and the foreign key used
> Å. If ICU was updated, the RI constraint between the rows would break,
> leaving an orphaned foreign key.

This isn't a problem for Postgres, or at least wouldn't be right now,
because we don't have case insensitive collations. So, we use a
strcmp()/memcmp() tie-breaker when strcoll() indicates equality, while
also making the general notion of text equality actually mean binary
equality. In short, we are aware that cases like this exist. IIRC
Unicode Technical Standard #10 independently recommends that this
tie-breaker strategy is one way of dealing with problems like this, in
a pinch, though I think we came up with the idea independently of that
recommendation. This was in response to a bug report over 10 years
ago.

I would like to get case insensitive collations some day, and was
really hoping that ICU would help. That being said, the need for a
strcmp() tie-breaker makes that hard. Oh well.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-09-07 01:46:06 Re: Let file_fdw access COPY FROM PROGRAM
Previous Message Gerdan Rezende dos Santos 2016-09-07 01:37:47 Re: [PATCH] add option to pg_dumpall to exclude tables from the dump