Re: ICU integration

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ICU integration
Date: 2017-02-16 05:10:33
Message-ID: ce46c712-4776-c5e3-9121-864673522f92@2ndquadrant.com
Lists: pgsql-hackers

Updated and rebased patch.

Significant changes:

- Changed collversion to type text

- Changed pg_locale_t to a union

- Use ucol_getAvailable() instead of uloc_getAvailable(), so the set of
initial collations is smaller now, because redundancies are eliminated.

- Added keyword variants to predefined ICU collations (so you get
"de_phonebook%icu", for example). So the initial set of collations is
bigger now. :)

- Predefined ICU collations have a comment now, so \dOS+ is useful.

- Use ucol_nextSortKeyPart() for abbreviated keys

- Enhanced tests and documentation
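
To illustrate the pg_locale_t item above, here is a minimal sketch of a
provider-tagged union. It is not taken from the patch: every name in it
(sketch_locale, sketch_provider, sketch_locale_name) is hypothetical, and
the void pointers are stand-ins for a libc locale_t and an ICU UCollator
so that the fragment compiles without either library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical provider tags; the patch may name these differently. */
typedef enum
{
    SKETCH_PROVIDER_LIBC,
    SKETCH_PROVIDER_ICU
} sketch_provider;

/* Hypothetical stand-in for a locale object that can hold either kind. */
typedef struct sketch_locale
{
    sketch_provider provider;       /* says which union member is valid */
    union
    {
        void       *libc_handle;    /* stand-in for a libc locale_t */
        struct
        {
            const char *locale;     /* ICU locale name, e.g. "de" */
            void       *collator;   /* stand-in for a UCollator pointer */
        }           icu;
    }           info;
} sketch_locale;

/* Return a printable name, reading only the member the tag selects. */
static const char *
sketch_locale_name(const sketch_locale *loc)
{
    assert(loc != NULL);

    switch (loc->provider)
    {
        case SKETCH_PROVIDER_ICU:
            return loc->info.icu.locale;
        case SKETCH_PROVIDER_LIBC:
            return "(libc)";
    }
    return NULL;                    /* not reached for valid tags */
}
```

The point of the tag is that callers branch on the provider before
touching the union, so libc and ICU code paths never read each other's
member.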

I believe all issues raised in reviews have been addressed.

Discussion points:

- Naming of collations: Are we happy with the "de%icu" naming? I might
have come up with that while reviewing the IPv6 zone index patch. ;-)
An alternative might be "de$icu" for more Oracle vibe and avoiding the
need for double quotes in some cases. (But we have mixed-case names
like "de_AT%icu", so further changes would be necessary to fully get rid
of the need for quoting.) A more radical alternative would be to
install ICU locales in a different schema and use the search_path, or
even have a separate search path setting for collations only. Which
leads to ...

- Selecting default collation provider: Maybe we want a setting, say in
initdb, to determine which provider's collations get the "good" names?
Maybe not necessary for this release, but something to think about.

- Currently (in this patch), we check a collation's version when it is
first used. But, say, after pg_upgrade, you might want to check all of
them right away. What might be a good interface for that? (Possibly,
we only have to check the ones actually in use, and we have dependency
information for that.)
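
On the naming point above, the quoting question can be made concrete with
a small check. This is only a sketch under stated assumptions: the
function name sketch_needs_quotes is hypothetical, it handles ASCII only,
and it ignores reserved keywords. It relies on the documented rule that
an unquoted identifier starts with a letter or underscore, continues with
letters, digits, underscores or dollar signs, and is folded to lower
case, so a mixed-case name can only be referenced with quotes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether a collation name must be written
 * in double quotes to be referenced exactly as spelled.  ASCII only;
 * reserved keywords are not considered.
 */
static bool
sketch_needs_quotes(const char *name)
{
    assert(name != NULL);

    /* First character must be a lower-case letter or an underscore. */
    if (!((name[0] >= 'a' && name[0] <= 'z') || name[0] == '_'))
        return true;

    for (const char *p = name + 1; *p != '\0'; p++)
    {
        bool        ok = (*p >= 'a' && *p <= 'z') ||
                         (*p >= '0' && *p <= '9') ||
                         *p == '_' || *p == '$';

        /* Upper case would be folded; '%' is not an identifier character. */
        if (!ok)
            return true;
    }
    return false;
}
```

With this check, "de%icu" needs quotes and "de$icu" does not, while
"de_AT$icu" still does because of the upper-case letters; an all
lower-case "de_at$icu" would not.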

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v4-0001-ICU-support.patch text/x-patch 153.4 KB
