Re: Built-in case-insensitive collation pg_unicode_ci

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Built-in case-insensitive collation pg_unicode_ci
Date:	2025-09-21 19:18:36
Message-ID:	6006986a1f0e55f970eddd94adc3b07c2d594e5e.camel@j-davis.com
Lists:	pgsql-hackers

On Fri, 2025-09-19 at 17:21 -0700, Jeff Davis wrote:

> ----------
> Versioning
> ----------
>
> Unlike other built-in collations, the order does depend on the version
> of Unicode...
> That means that indexes, including primary keys, can become
> inconsistent after a major version upgrade...

There's another option here: we can have the PG_UNICODE_CI collation
throw an error when the comparison involves unassigned code points.
That would give us assurance that primary keys remain consistent across
upgrades.

While not every user would want that for their entire database, I think
it's a good idea in the case of PG_UNICODE_CI:

* It would ensure that all primary keys using any built-in collation
are stable across upgrades.
* If the data is somewhere else, like an unindexed column or an index
with a different collation, then unassigned code points would still be
just fine.
* The cases where you'd want to use the PG_UNICODE_CI collation are
also the cases where it's not so important to permit very recently
assigned code points.
* Applications already need to expect errors when inserting into a
primary key or unique index, so it wouldn't require rewriting
applications to handle such errors.
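
As a rough illustration of the idea (not the proposed implementation),
the check could be sketched in Python as below. Everything here is an
assumption made for the example: the names ci_compare, check_assigned
and UnassignedCodePointError are hypothetical, str.casefold() stands in
for whatever case folding PG_UNICODE_CI would actually use, and
"unassigned" is approximated as general category Cn in the Unicode
version bundled with the Python build (which also covers
noncharacters).

```python
import unicodedata


class UnassignedCodePointError(ValueError):
    """Hypothetical error: input has a code point unassigned in this Unicode version."""


def check_assigned(s: str) -> None:
    # General category "Cn" means "not assigned" in the Unicode version
    # that this Python build ships with (unicodedata.unidata_version).
    for ch in s:
        if unicodedata.category(ch) == "Cn":
            raise UnassignedCodePointError(
                f"unassigned code point U+{ord(ch):04X} "
                f"(Unicode {unicodedata.unidata_version})"
            )


def ci_compare(a: str, b: str) -> int:
    # Reject unassigned code points before comparing, so the result can
    # never depend on how a later Unicode version folds or orders them.
    check_assigned(a)
    check_assigned(b)
    # casefold() is only a stand-in for the collation's real folding rules.
    ka, kb = a.casefold(), b.casefold()
    return (ka > kb) - (ka < kb)
```

With this sketch, ci_compare("ABC", "abc") returns 0, while a string
containing U+0378 (unassigned at the time of writing) raises
UnassignedCodePointError instead of producing an ordering that a later
Unicode version might change.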

Regards,
Jeff Davis

In response to

Built-in case-insensitive collation pg_unicode_ci at 2025-09-20 00:21:34 from Jeff Davis
