Re: Unicode Normalization

From:	"David E(dot) Wheeler" <david(at)kineticode(dot)com>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	pg1(at)thetdh(dot)com, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Unicode Normalization
Date:	2009-09-24 16:05:58
Message-ID:	233B7C57-2096-4C9E-9704-14D1EF2164B4@kineticode.com
Lists:	pgsql-hackers

On Sep 24, 2009, at 8:59 AM, Andrew Dunstan wrote:

>> That might be nice, but I'd be wary of a geometric multiplication
>> of text types. We already have TEXT and CITEXT; what if we had your
>> NTEXT (normalized text) but I wanted it to also be case-insensitive?
>
> Actually, I don't think it's necessarily a good idea at all. If a
> user inputs a perfectly valid piece of UTF8 text, we should be able
> to give it back to them exactly, whether or not it's in normalized
> form. The normalized forms are useful for certain comparison
> purposes, but they don't affect the validity of the text. CITEXT
> doesn't mangle what is stored, just how it's compared.

Right, I don't think there's a need for a normalized TEXT type.
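
To make the distinction concrete, here is a minimal sketch in Python rather than anything from PostgreSQL itself. It assumes normalization is applied only at comparison time, in the way CITEXT applies case folding only at comparison time. The helper name normalized_equal and its fold_case flag are hypothetical, and the case folding shown is a simplification of full Unicode caseless matching.

```python
import unicodedata


def normalized_equal(a: str, b: str, fold_case: bool = False) -> bool:
    """Compare two strings by normalized form without modifying either one.

    Hypothetical helper: normalization (and optional case folding) is
    applied to temporary copies only, so the caller's data is untouched.
    """
    if fold_case:
        a, b = a.casefold(), b.casefold()
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# The value as the user entered it; it is stored and returned unchanged.
stored = decomposed

# The raw strings differ, but they compare equal under normalization.
raw_match = composed == decomposed
normalized_match = normalized_equal(composed, stored)

# Case-insensitivity is a second comparison option, not a second type.
caseless_match = normalized_equal("\u00c9", stored, fold_case=True)
```

Both behaviors are options of one comparison function over unmodified data, so they combine without requiring a separate text type for each combination.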

Best,

David

In response to

Re: Unicode Normalization at 2009-09-24 15:59:09 from Andrew Dunstan
