Re: What users can do with custom ICU collations in Postgres 10

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: What users can do with custom ICU collations in Postgres 10
Date: 2017-08-15 19:04:37
Message-ID: CAH2-Wz=jpzppx1Pe-R7Vu42ybHZBh4ih+QMZ2-mxgE=Gzk5b-Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 15, 2017 at 11:33 AM, Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> On 8/9/17 18:49, Peter Geoghegan wrote:
>> I'd like to give a demo on what is already possible, but not currently
>> documented. I didn't see anyone else comment on this, including Peter
>> E (maybe I missed that?). We should improve the documentation in this
>> area, to get this into the hands of users.
>
> Here is a small piece of documentation. Thoughts?

This looks pretty good, but I do have some feedback:

* "23.2.2.3. Copying Collations" suggests that the only use of CREATE
COLLATION is copying collations, which is far from true with ICU. We
should change that at the same time as this change is made. I think
that just changing the title would improve the overall flow of the
page.

* Maybe add an example of numeric ordering -- the "alphanumeric
invoice" case, where you want the numbers embedded in text to sort as
numbers whenever the comparison comes down to those numeric parts
(so "INV-9" sorts before "INV-10"). I think that's really useful, and
worth specifically calling out. I definitely would have used it had
it been available ten years ago.

* Let's use "en-u-kr-others-digit" instead of "en-u-kr-latn-digit" in
the example. It makes no real difference to us English speakers, but
means that the example works the same for those that use a different
alphabet. It's more culturally neutral.

* If we end up having initdb put all locales rather than all
collations in pg_collation, which I think is very likely, then we can
put in a link to ICU's locale explorer web resource:

https://ssl.icu-project.org/icu-bin/locexp?d_=en&_=en_HK

This lets the user see exactly what they'll get from a base locale
using an intuitive interface (assuming it matches their CLDR version).
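
To make the two collation suggestions above concrete, here is a
minimal sketch. It assumes a Postgres 10 server built with ICU support
and an ICU version recent enough to accept language-tag locale names;
the collation names, the table, and the sample values are hypothetical
and only there for illustration.

```sql
-- Assumption: PostgreSQL 10+ built with ICU; names below are made up.

-- Numeric ordering: runs of digits compare by their numeric value.
CREATE COLLATION natural_en (provider = icu, locale = 'en-u-kn-true');

-- Reordering: digits sort after all other scripts, not just Latin.
CREATE COLLATION digits_last (provider = icu, locale = 'en-u-kr-others-digit');

-- Hypothetical "alphanumeric invoice" data.
CREATE TABLE invoices (invoice_no text);
INSERT INTO invoices VALUES ('INV-10'), ('INV-9'), ('INV-100'), ('9-INV');

-- With numeric ordering, INV-9 should come before INV-10 and INV-100.
SELECT invoice_no FROM invoices ORDER BY invoice_no COLLATE natural_en;

-- With digits reordered last, 9-INV should come after the INV-* rows.
SELECT invoice_no FROM invoices ORDER BY invoice_no COLLATE digits_last;
```

On older ICU versions the language-tag form may not be accepted, in
which case the legacy keyword spelling (for example
'en@colNumeric=yes') would presumably be needed instead.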

--
Peter Geoghegan
