From: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Subject: | Re: ICU for global collation |
Date: | 2021-12-30 12:07:21 |
Message-ID: | 525ef44f-52bf-505f-a491-07835d039424@enterprisedb.com |
Lists: | pgsql-hackers |
There were a few inquiries about this topic recently, so I dug up the
old thread and patch. What we got stuck on last time was that we can't
just swap out all locale support in a database for ICU. We still need
to set the usual locale environment; otherwise some things that are not
ICU-aware will break or degrade. I had initially anticipated fixing
that by converting everything that uses libc locales to ICU. But that
turned out to be tedious and ultimately not very useful as far as the
user-facing result is concerned, so I gave up.
So this is a different approach: If you choose ICU as the default locale
for a database, you still need to specify lc_ctype and lc_collate
settings, as before. Unlike in the previous patch, where the ICU
collation name was written in datcollate, there is now a third column
(daticucoll), so we can store all three values. This fixes the
described problem. Other than that, once you get all the initial
settings right, it basically just works: The places that already have ICU
support will use a database-wide ICU collation if appropriate, and the
places that don't have ICU support continue to use the global libc
locale settings.
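
As a rough illustration of that rule, here is a minimal Python sketch. It is
not code from the patch: the DatabaseLocale class and the effective_locale
function are hypothetical names, the locale values are made up, and only the
three column names (datcollate, datctype, daticucoll) come from the
description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DatabaseLocale:
    """Hypothetical model of the three per-database locale settings."""
    datcollate: str                   # libc LC_COLLATE, always specified
    datctype: str                     # libc LC_CTYPE, always specified
    daticucoll: Optional[str] = None  # ICU collation, None if not applicable


def effective_locale(db: DatabaseLocale, icu_aware: bool) -> Tuple[str, str]:
    """Return the (provider, locale) pair a piece of collation code would use.

    ICU-aware code picks the database-wide ICU collation when one is set;
    everything else keeps using the libc settings.
    """
    if icu_aware and db.daticucoll is not None:
        return ("icu", db.daticucoll)
    return ("libc", db.datcollate)


# Made-up example values, for illustration only.
icu_db = DatabaseLocale("en_US.UTF-8", "en_US.UTF-8", "en-US")
libc_db = DatabaseLocale("en_US.UTF-8", "en_US.UTF-8")
```

In this sketch, code without ICU support always sees the libc values, which
is why all three settings have to be stored side by side.
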
I changed the datcollate, datctype, and the new daticucoll fields to
type text (from name). That way, the daticucoll field can be set to
null if it's not applicable. Also, the limit of 63 characters can
actually be a problem if you want to use some combination of the options
that ICU locales offer. And for less extreme uses, having
variable-length fields will save some storage, since typical locale
names are much shorter.
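
To make the length concern concrete, here is a small Python check. It assumes
the default NAMEDATALEN of 64, which leaves 63 bytes for a name value. The
fits_in_name helper is hypothetical, and the long locale string is a made-up
example that simply stacks several ICU collation options.

```python
# Default NAMEDATALEN is 64; one byte is reserved for the trailing NUL.
NAME_MAX_BYTES = 63


def fits_in_name(locale: str) -> bool:
    """Check whether a locale string would fit into a name-typed column."""
    return len(locale.encode("utf-8")) <= NAME_MAX_BYTES


# A typical libc locale name is far below the limit.
typical = "en_US.UTF-8"

# Made-up ICU language tag combining many collation options.
elaborate = (
    "de-u-co-phonebk-ka-shifted-kb-true-kc-true"
    "-kf-upper-kn-true-kr-latn-digit-ks-level2"
)

print(fits_in_name(typical), fits_in_name(elaborate))
```
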
For the same reasons and to keep things consistent, I also changed the
analogous pg_collation fields in the same way. This also removes some weird
code that checks that colcollate and colctype are the same for ICU, so
it's cleaner overall.
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Add-option-to-use-ICU-as-global-collation-provide.patch | text/plain | 69.7 KB |