Re: Pre-proposal: unicode normalized text

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Pre-proposal: unicode normalized text
Date: 2023-10-11 06:56:13
Message-ID: 6c78e772-0c1b-4a8f-ab14-d673a086d35e@eisentraut.org
Lists: pgsql-hackers

On 11.10.23 03:08, Jeff Davis wrote:
> * unicode_is_valid(text): returns true if all codepoints are
> assigned, false otherwise

We need to be careful about precise terminology. "Valid" has a defined
meaning for Unicode. A byte sequence can be valid or not as UTF-8. But
a string containing unassigned code points is not thereby invalid as Unicode.
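
As an illustration of that distinction, here is a minimal Python sketch. It
is not part of the proposal: the helper names is_valid_utf8 and all_assigned
are hypothetical, and the result of the "assigned" check depends on the
Unicode version the Python build ships with (unicodedata.unidata_version).

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Encoding-level check: is this byte sequence well-formed UTF-8?"""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def all_assigned(s: str) -> bool:
    """Character-level check: no code point has general category Cn.

    Cn covers unassigned code points and noncharacters, as known to the
    Unicode version bundled with this Python build.
    """
    return all(unicodedata.category(ch) != "Cn" for ch in s)


# U+0378 is assumed here to be unassigned in the bundled Unicode version.
sample = "\u0378"
print(is_valid_utf8(sample.encode("utf-8")))  # well-formed as UTF-8
print(all_assigned(sample))                   # but contains an unassigned code point
print(is_valid_utf8(b"\xff"))                 # 0xFF can never appear in UTF-8
```

The two checks answer different questions, which is why a function that
tests for assigned code points probably should not be named "is_valid".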

> * unicode_version(): version of unicode Postgres is built with
> * icu_unicode_version(): version of Unicode ICU is built with

This seems easy enough, but it's not clear what users would actually do
with that.
