Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Daniel Verite <daniel(at)manitou-mail(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)
Date: 2017-08-07 22:23:56
Message-ID: 5534.1502144636@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> On 8/6/17 20:07, Peter Geoghegan wrote:
>> I've looked into this. I'll give an example of what keyword variants
>> there are for Greek, and then discuss what I think each is.

> I'm not sure why we want to get into editorializing this. We query ICU
> for the names of distinct collations and use that. It's more than most
> people need, sure, but it doesn't cost us anything.

Yes, *it does*. The cost will be borne by users who get screwed at update
time, not by developers, but that doesn't make it insignificant.

> The alternatives are hand-maintaining a list of collations, or
> installing no collations by default. Both of those are arguably worse
> for users or for future code maintenance or both.

I'm not (yet) convinced that we need a hand-maintained whitelist. But
I am wondering why we're expending extra code to import keyword variants.
Who is that catering to, really?

The thing that I'm particularly thinking about is that if someone wants
an ICU variant collation that we didn't make initdb provide, they'll do
a CREATE COLLATION and go use it. At update time, pg_dump or pg_upgrade
will export/import that via CREATE COLLATION, and the only way it fails
is if ICU rejects the collation name as garbage. (Which, as we already
established upthread, it's quite unlikely to do.) On the other hand,
if someone relies on an ICU variant collation that initdb did import,
and then in the next release that collation doesn't get imported because
ICU changed their minds on what to advertise, the update situation is not
pretty at all. Certainly it won't get handled transparently. This line
of thinking leads me to believe that we ought to be pretty conservative
about what we import during initdb.
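
The asymmetry described above can be sketched with a toy model. This is only an illustration of the argument, not PostgreSQL code: the function names, the collation names, and the locale string are hypothetical, and the model assumes that the dump recreates only user-created collations while everything else depends on what the new initdb imports.

```python
# Toy model of the upgrade scenario; this is not PostgreSQL code.
# Every name below is hypothetical and chosen only for illustration.

def dump_user_collations(user_created):
    """Emit one CREATE COLLATION statement per user-created collation,
    roughly as a dump would; initdb-provided entries are not dumped."""
    return [
        f"CREATE COLLATION {name} (provider = icu, locale = '{locale}');"
        for name, locale in sorted(user_created.items())
    ]

def missing_after_upgrade(new_initdb, user_created, referenced):
    """Return referenced collations that exist nowhere in the new cluster."""
    available = set(new_initdb) | set(user_created)
    return sorted(set(referenced) - available)

# What initdb imports in the old and the new release (assumed lists).
OLD_INITDB = {"el-x-icu", "el-u-co-standard-x-icu"}
NEW_INITDB = {"el-x-icu"}  # assumption: ICU stopped advertising the variant

# Case 1: the user created the variant explicitly, so the dump carries it.
user_created = {"my_greek": "el-u-co-standard"}
case1 = missing_after_upgrade(NEW_INITDB, user_created, ["my_greek"])

# Case 2: the user relied on an entry that the old initdb happened to import.
assert "el-u-co-standard-x-icu" in OLD_INITDB
case2 = missing_after_upgrade(NEW_INITDB, {}, ["el-u-co-standard-x-icu"])

print(dump_user_collations(user_created))
print("case 1 missing:", case1)
print("case 2 missing:", case2)
```

In this model the explicitly created collation survives because it travels with the dump, whereas the initdb-imported variant is simply absent from the new cluster and anything referencing it breaks.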

regards, tom lane
