F.19. icu_multilib

F.19.1. ICU Version Stability
F.19.2. Multiple ICU Version Support
F.19.3. Configuration Parameters
F.19.4. Functions
F.19.5. Examples
F.19.6. Authors

The icu_multilib module provides control over the version (or versions) of the ICU provider library used by PostgreSQL, which can be different from the version of ICU with which it was built.

Collations are a product of natural language, and natural language evolves over time; but PostgreSQL depends on stable ordering for structures such as indexes. Newer versions of ICU update the provided collators to adapt to changes in natural language, so it's important to control when and how those new versions of ICU are used to prevent problems such as index corruption.

This module assumes that the necessary versions of ICU are already available, such as through the operating system's package manager; and already properly installed in a single location accessible to PostgreSQL. The configration variable icu_multilib.library_path should be set to the location where these ICU library versions are installed.

icu_multilib must be loaded via shared_preload_libraries. icu_multilib ignores any ICU library with a major version greater than that with which PostgreSQL was built.

F.19.1. ICU Version Stability

The simplest way to use icu_multilib is to force ICU version stability. Version stability means that PostgreSQL will use the same version of ICU for all collations, even if the operating system or PostgreSQL itself is upgraded or moved. The stable version may or may not be the same as that with which PostgreSQL was built.

Version stability may be based on both the major and minor version of ICU, such as 68.2; or on the major version alone, such as 68. Specifying both the major and minor versions more closely controls exactly which ICU library is loaded, but applying minor version upgrades of the ICU library may interfere with normal operation. Specifying the major version alone makes applying ICU minor version updates simpler, but carries a risk of subtle differences in collation order if the library is updated.

To configure for version stability, first make sure that icu_multilib.library_path is set correctly, which you can observe by executing the icu_multilib.library_versions() function. Then set icu_multilib.default_icu_version to the stable version, and set icu_multilib.search_by_collator_version to false. These configuration variables should be set in postgresql.conf to ensure that the stable version of ICU is used consistently.

Note

Ensure that the stable version of the ICU library is always available, including on all replicas and through operating system and PostgreSQL upgrades.

Similarly, keep the settings in postgresql.conf, and keep them consistent across replicas and upgrades.

For best results, stabilize the ICU version immediately after initdb is run, to ensure that initial collations are loaded from the stabilized version of the ICU library. To do this, use the option --no-import-collations for initdb, import the collations manually, and then refresh the versions for the default collations. For instance:

$ initdb --no-import-collations -D data --locale-provider=icu --icu-locale="en_US" --locale="en_US.UTF-8"
$ cat - >> data/postgresql.conf
shared_preload_libraries = 'icu_multilib'
icu_multilib.library_path = '/path/to/icu/lib'
icu_multilib.default_icu_version = '65.1'
icu_multilib.search_by_collator_version = false
icu_multilib.include_builtin = false
^D
$ pg_ctl -D data -l logfile start
$ psql template1
WARNING:  icu_multilib: collator version mismatch detected for locale "en_US"
DETAIL:  ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112".
WARNING:  icu_multilib: collator version mismatch detected for locale "en_US"
DETAIL:  ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112".
WARNING:  database "template1" has a collation version mismatch
DETAIL:  The database was created using collation version 153.112, but the operating system provides version 153.97.
HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE template1 REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
=# ALTER DATABASE template1 REFRESH COLLATION VERSION;
NOTICE:  changing version from 153.112 to 153.97
ALTER DATABASE
=# SELECT pg_import_system_collations('pg_catalog');
 pg_import_system_collations
-----------------------------
                         792
(1 row)

=# \q
$ psql postgres
WARNING:  icu_multilib: collator version mismatch detected for locale "en_US"
DETAIL:  ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112".
WARNING:  icu_multilib: collator version mismatch detected for locale "en_US"
DETAIL:  ICU 65.1 provides collator version "153.97" for locale "en_US"; expected version "153.112".
WARNING:  database "postgres" has a collation version mismatch
DETAIL:  The database was created using collation version 153.112, but the operating system provides version 153.97.
HINT:  Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
=# ALTER DATABASE postgres REFRESH COLLATION VERSION;
NOTICE:  changing version from 153.112 to 153.97
ALTER DATABASE
=# SELECT pg_import_system_collations('pg_catalog');
 pg_import_system_collations
-----------------------------
                         792
(1 row)

=# \q
$

The result will be a system based on ICU version 65.1, and PostgreSQL will use that version of ICU for all collations that use the ICU provider.

F.19.2. Multiple ICU Version Support

Support for multiple ICU library versions in the same database allows the system to adapt to changes in natural language over time. As these changes are introduced in new versions of ICU, those new versions can be brought into PostgreSQL incrementally without interfering with existing data (or structures like indexes).

F.19.2.1. Search by Collator Version

Collators provided by ICU also have a version which is distinct from the version of ICU. For instance, in ICU version 70.1, the collator for the en_US locale has version 153.112. A collator for a given locale may or may not be assigned a new version when the ICU library version is updated.

When a collation is created, PostgreSQL obtains the version of the collator from the ICU library and records it in the catalog. This recorded version can be updated using ALTER COLLATION ... REFRESH VERSION; or ALTER DATABASE ... REFRESH COLLATION VERSION if it's the database's default collation. These ALTER commands obtain the new collator version from the ICU library identified by icu_multilib.default_icu_version if set; otherwise they obtain the new collator version from the built-in ICU library.

Set icu_multilib.search_by_collator_version to true to enable icu_multilib to search (in descending order of ICU major version) among the available libraries to find the first one that provides a collator with a version that matches the one recorded in the catalog.

For example, ICU versions 62.2 and 63.2 both provide collator version 153.88 for the locale en_US; while ICU version 64.2 provides collator version 153.97 for the same locale. Searching for an ICU library for a collation with a recorded version of 153.88 will find the ICU library with version 63.2, because that's the ICU library with the highest major version (63) that provides collator version 153.88. The function icu_multilib.collator_version_search simulates this search for the given locale and version:

SELECT * FROM icu_multilib.collator_version_search('en_US', '153.88');
 icu_version | uca_version | collator_version
-------------+-------------+------------------
 63.2        | 11.0        | 153.88
(1 row)

Note

Searching by collator version relies on the assumption that collators with the same version behave identically regardless of which ICU library provides it. The administrator must weigh the risks of this assumption against the convenience when managing multiple major versions of ICU.

F.19.3. Configuration Parameters

icu_multilib.library_path (string)

The filesystem path in which to search for ICU libraries. Must be set for icu_multilib to be useful. Multiple major versions of ICU may be present; but for each major version, only a single minor version can be present.

icu_multilib.version_mismatch_log_level (string)

A version mismatch happens when the version of a collator provided by the ICU library is different than then collator version recorded when the collation was created (or last refreshed).

When a version mismatch is detected, a log message is emitted at this level. WARNING is the default. If the level is set to ERROR, it may interfere with normal operation.

icu_multilib.library_search_log_level (string)

icu_multilib emits messages at this level while determining which ICU library to choose. The default is DEBUG1. This setting is useful to diagnose problems when a suitable ICU library is not found.

icu_multilib.default_icu_version (string)

Identifies the version of the ICU library to choose; or, if icu_multilib.search_by_collator_version is true, the ICU library to choose if no identical match to the collator version is found. The default is the empty string, meaning to use the built-in ICU library.

Valid values are either of the form major.minor such as 70.1; or a major version number alone, such as 70.

icu_multilib.include_builtin (string)

Determines whether the built-in version of ICU is included when searching for a suitable ICU library. The default is true.

Even if set to false, the built-in ICU library may be used if no other suitable ICU library is identified and icu_multilib.default_icu_version is set to the empty string.

icu_multilib.search_by_collator_version (string)

Causes icu_multilib to identify the ICU library with the highest major version that offers the required collator with an exactly-matching collator version. The default is false.

When set to true, icu_multilib relies on equal collator versions to produce identical collation order.

If no ICU library is found with an exactly-matching collator version, icu_multilib will fall back to the icu_multilib.default_icu_version; or if not set, to the built-in ICU library.

F.19.4. Functions

Table F.9. icu_multilib Functions

Function

Description

icu_multilib.library_versions () → setof record ( icu_version text, unicode_version text, cldr_version text, libicui18n_name text, libicuuc_name text)

Returns details for each available ICU library found in icu_multilib.library_path. Also includes the built-in ICU library iv icu_multilib.include_builtin is true.

icu_multilib.library_collators ( major_version text DEFAULT null ) → setof record ( locale text, icu_version text, uca_version text, collator_version text )

Returns details for all available collators provided by the ICU library with the given major_version; or by the built-in ICU library if major_version is null.

icu_multilib.collator_version_search ( locale text, requested_version text DEFAULT null, log_ok boolean DEFAULT false ) → record ( icu_version text, uca_version text, collator_version text )

Performs a search for the appropriate ICU library given the locale name and requested collator version, and returns details about the ICU library.

If requested_version is null, it will return the ICU library identified by icu_multilib.default_icu_version if set to true; otherwise the built-in ICU library.

If log_ok is true, the search may emit log messages at the level icu_multilib.library_search_log_level, which may be useful for diagnosing misconfiguration.

icu_multilib.collator_versions ( locale text ) → setof record ( icu_version text, uca_version text, collator_version text )

Returns details about available collators for the given locale from all available ICU libraries.


F.19.5. Examples

SELECT icu_version, unicode_version, cldr_version FROM icu_multilib.library_versions() LIMIT 3;
 icu_version | unicode_version | cldr_version
-------------+-----------------+--------------
 70.1        | 14.0            | 40.0
 69.1        | 13.0            | 39.0
 68.2        | 13.0            | 38.1
(3 rows)
SELECT * FROM icu_multilib.collator_versions('en_US') WHERE icu_version BETWEEN '60.0' and '65.0';
 icu_version | uca_version | collator_version
-------------+-------------+------------------
 64.2        | 12.1        | 153.97
 63.2        | 11.0        | 153.88
 62.2        | 11.0        | 153.88
 61.2        | 10.0        | 153.80
 60.3        | 10.0        | 153.80
(5 rows)
SELECT * FROM icu_multilib.collator_version_search('en_US', '153.97');
 icu_version | uca_version | collator_version
-------------+-------------+------------------
 65.1        | 12.1        | 153.97
(1 row)

F.19.6. Authors

Jeff Davis , Thomas Munro