Re: Collation version tracking for macOS

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Rod Taylor <rbt(at)rbt(dot)ca>, Jim Nasby <nasbyj(at)amazon(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Pgsql-Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation version tracking for macOS
Date: 2022-06-08 20:01:43
Message-ID: 1738483.1654718503@sss.pgh.pa.us
Lists: pgsql-hackers

"Daniel Verite" <daniel(at)manitou-mail(dot)org> writes:
> Independently of these rules, all Unicode collations change frequently
> because each release of Unicode adds new characters. Any string
> that contains a code point that was previously unassigned is going
> to be sorted differently by all collations when that code point gets
> assigned to a character.
> Therefore the versions of all collations need to be bumped at every
> Unicode release. This is what ICU does.

I'm very skeptical of this process as being a reason to push users
to reindex everything in sight. If U+NNNN was not a thing last year,
there's no reason to expect that it appears in anyone's existing data,
and therefore the fact that it sorts differently this year is a poor
excuse for sounding time-to-reindex alarm bells.

I'm quite concerned that we are going to be training users to ignore
collation-change warnings. They have got to be a lot better targeted
than this, or we're just wasting everyone's time, including ours.
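
To make the targeting argument concrete, here is a minimal sketch (an illustration only, not something proposed in this thread) of how one might check which stored strings could even be affected by newly assigned code points. It assumes that Python's bundled `unicodedata` tables stand in for the "old" Unicode version the index was built under; the function names `unassigned_code_points` and `needs_recheck` are hypothetical.

```python
import unicodedata


def unassigned_code_points(text):
    """Return the distinct characters in `text` that the bundled Unicode
    database classifies as unassigned (general category "Cn")."""
    return sorted({ch for ch in text if unicodedata.category(ch) == "Cn"})


def needs_recheck(values):
    """Return only the values containing at least one unassigned code point.

    Assumption: only such values can change sort position when a later
    Unicode release assigns those code points. Changes to the ordering of
    already-assigned characters are out of scope for this sketch.
    """
    return [v for v in values if unassigned_code_points(v)]


if __name__ == "__main__":
    # U+0378 is used here as an example of a code point with no assigned
    # character in the bundled database.
    sample = ["abc", "caf\u00e9", "x\u0378y"]
    for value in needs_recheck(sample):
        print(repr(value))
```

If a scan like this over existing data returns nothing, then the assignment of new characters alone would give no reason to reindex that data, which is the case for a more narrowly targeted warning.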

regards, tom lane
