Re: Pre-proposal: unicode normalized text

From:	Nico Williams <nico(at)cryptonector(dot)com>
To:	Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc:	Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Pre-proposal: unicode normalized text
Date:	2023-11-02 23:17:33
Message-ID:	ZUQuDS8QN9+rYueh@ubby21
Lists:	pgsql-hackers

On Tue, Oct 17, 2023 at 05:07:40PM +0200, Daniel Verite wrote:
> > * Add a per-database option to enforce only storing assigned unicode
> > code points.
>
> There's a problem in the fact that the set of assigned code points is
> expanding with every Unicode release, which happens about every year.
>
> If we had this option in Postgres 11 released in 2018 it would use
> Unicode 11, and in 2023 this feature would reject thousands of code
> points that have been assigned since then.

Yes, and that's desirable if PG were to normalize text as Jeff proposes,
since PG wouldn't know how to normalize text containing codepoints
assigned after the Unicode version it was built with. To use those
codepoints you'd have to upgrade PG, which is not too unreasonable.
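
To illustrate the point, here is a minimal sketch in Python rather than
anything from the PG tree. It assumes the check is "reject any code point
whose general category is Cn (unassigned) in the Unicode data this runtime
ships with", and only then normalizes. The function names has_unassigned
and store_normalized are hypothetical, chosen for this example only.

```python
import unicodedata

def has_unassigned(text: str) -> bool:
    """Return True if any code point is unassigned (general category Cn)
    according to the Unicode data bundled with this Python build."""
    return any(unicodedata.category(ch) == "Cn" for ch in text)

def store_normalized(text: str, form: str = "NFC") -> str:
    """Hypothetical store path: refuse text the bundled Unicode data
    cannot classify, otherwise return its normalized form."""
    if has_unassigned(text):
        raise ValueError(
            "text contains code points unassigned in Unicode "
            + unicodedata.unidata_version
        )
    return unicodedata.normalize(form, text)
```

A code point assigned in a later Unicode release is reported as Cn by an
older runtime, so the same input starts being accepted only after the
runtime (here Python, in the thread's case PG) is upgraded.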

Nico
--

In response to

Re: Pre-proposal: unicode normalized text at 2023-10-17 15:07:40 from Daniel Verite
