Re: ICU for global collation

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ICU for global collation
Date: 2022-01-10 12:56:56
Message-ID: YdwtGNAAVN79R1Ik@jrouhaud
Lists: pgsql-hackers

On Mon, Jan 10, 2022 at 12:49:07PM +0100, Daniel Verite wrote:
>
> The "daticucol" column also suggests that we don't expect to add
> other collation providers in the future. Maybe a pair of columns like
> (datcollprovider, datcolllocale) would be more future-proof,
> or a (datcollprovider, datcolllocale, datcollisdeterministic)
> triplet if non-deterministic collations are allowed.

I'm not sure about allowing a non-deterministic default collation, given the
current restrictions on them, but the extra column seems like a good idea. It would
require a bit more thinking, as we would need a second collation column in
pg_database for any default provider that's not libc.

> Also, pg_collation has "collversion" to detect a mismatch between
> the ICU runtime and existing indexes. I don't see that field
> for the db-wide ICU collation, so maybe we currently miss the capability
> to detect the mismatch for the db-wide collation?

I don't think that storing a version there will really help. There's no
guarantee that any given object was created with the version of the collation
that was installed when the database was created. And we would still need
to store a version with each underlying object anyway, as rebuilding all broken
dependencies can take a long time and may span a server restart.
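
To illustrate the point, here is a minimal sketch. It is not PostgreSQL code:
the Index class, the stale_indexes helper and the version strings are all
made up for the example. It only shows that a single database-wide version
can say "something may be stale", while a version recorded per object says
which objects still need a rebuild.

```python
from dataclasses import dataclass


@dataclass
class Index:
    name: str
    built_with: str  # collation version recorded when the index was (re)built


def stale_indexes(indexes, current_version):
    """Return the names of indexes whose recorded version differs
    from the version the collation library reports now."""
    return [ix.name for ix in indexes if ix.built_with != current_version]


# Hypothetical scenario: the database was created under one library version,
# the library was upgraded later, and one index was built after the upgrade.
db_version_at_creation = "153.14"   # made-up version string
current = "153.112"                 # made-up version string
indexes = [
    Index("idx_old", "153.14"),     # built before the upgrade
    Index("idx_new", "153.112"),    # built after the upgrade
]

# A database-wide version can only flag that a mismatch exists somewhere.
db_level_mismatch = db_version_at_creation != current

# Per-object versions identify exactly what is left to rebuild, which is
# what matters if the rebuild is interrupted partway through.
print(db_level_mismatch)
print(stale_indexes(indexes, current))
```

With only the database-wide value, idx_new would look just as suspect as
idx_old, and there would be no record of progress if the rebuild stopped
halfway.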

> The lack of these fields overall suggest the idea that when CREATE
> DATABASE is called with a global ICU collation, what if it somehow
> inserted the collation into pg_collation in the new database?
> Then pg_database would just store the collation oid and no other
> collation-related field would need to be added into it, now
> or in the future.

I don't think it would be doable given the single-database-per-backend
restriction.
