Re: Pre-proposal: unicode normalized text

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Pre-proposal: unicode normalized text
Date:	2023-10-11 07:37:46
Message-ID:	5661a3b1cd8cf046d6b761c1bcf4eb82cb58397d.camel@j-davis.com
Lists:	pgsql-hackers

On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote:
> On 11.10.23 03:08, Jeff Davis wrote:
> > * unicode_is_valid(text): returns true if all codepoints are
> > assigned, false otherwise
>
> We need to be careful about precise terminology. "Valid" has a
> defined meaning for Unicode. A byte sequence can be valid or not as
> UTF-8. But a string containing unassigned code points is not
> not-"valid" as Unicode.

Agreed. Perhaps "unicode_assigned()" is better?
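
To make the distinction concrete, here is a minimal sketch of the "all code points assigned" check in Python rather than in Postgres. The function name all_assigned is hypothetical, and the result depends on the Unicode version bundled with the Python interpreter, not on whatever version Postgres or ICU is built with. It treats a code point as unassigned when its general category is Cn, which covers reserved code points and noncharacters.

```python
import unicodedata


def all_assigned(text: str) -> bool:
    """Return True if every code point in text is assigned.

    Assumption: "unassigned" means general category Cn, as reported by
    the Unicode data bundled with this Python build.
    """
    return all(unicodedata.category(ch) != "Cn" for ch in text)


if __name__ == "__main__":
    print(all_assigned("déjà vu"))    # every code point is assigned
    print(all_assigned("a\u0378"))    # U+0378 is reserved (unassigned)
```

A string such as "a\u0378" is still well-formed when encoded as UTF-8, which is why "valid" is the wrong word for this check.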

> > * unicode_version(): version of Unicode Postgres is built with
> > * icu_unicode_version(): version of Unicode ICU is built with
>
> This seems easy enough, but it's not clear what users would actually
> do with that.

Just there to make it visible. If it affects the semantics (which it
does currently for normalization) it seems wise to have some way to
access the version.
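
As an illustration of why exposing the version helps, here is a rough Python analogue, not the proposed Postgres functions. The helper normalization_info is a hypothetical name; it reports the Unicode version of the interpreter's own data tables next to a normalization result, so the two can be recorded together.

```python
import unicodedata


def normalization_info(text: str) -> dict:
    """Report the NFC form of text along with the Unicode data version used.

    Assumption: the version is that of Python's bundled Unicode tables,
    which stands in here for the version a database would be built with.
    """
    return {
        "unicode_version": unicodedata.unidata_version,
        "nfc": unicodedata.normalize("NFC", text),
        "is_nfc": unicodedata.is_normalized("NFC", text),
    }


if __name__ == "__main__":
    # "e" followed by U+0301 COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
    print(normalization_info("e\u0301"))
```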

Regards,
Jeff Davis
