Re: Unicode Normalization

From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pg1(at)thetdh(dot)com, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unicode Normalization
Date: 2009-09-24 16:05:58
Message-ID: 233B7C57-2096-4C9E-9704-14D1EF2164B4@kineticode.com
Lists: pgsql-hackers

On Sep 24, 2009, at 8:59 AM, Andrew Dunstan wrote:

>> That might be nice, but I'd be wary of a geometric multiplication
>> of text types. We already have TEXT and CITEXT; what if we had your
>> NTEXT (normalized text) but I wanted it to also be case-insensitive?
>
> Actually, I don't think it's necessarily a good idea at all. If a
> user inputs a perfectly valid piece of UTF8 text, we should be able
> to give it back to them exactly, whether or not it's in normalized
> form. The normalized forms are useful for certain comparison
> purposes, but they don't affect the validity of the text. CITEXT
> doesn't mangle what is stored, just how it's compared.

Right, I don't think there's a need for a normalized TEXT type.
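
To illustrate the distinction Andrew draws, here is a rough Python sketch (not PostgreSQL code, and not anything proposed in this thread) in which normalization is applied only when comparing, so the stored value is returned byte-for-byte as entered. The helper names compare_key and text_equal are hypothetical, and choosing NFC plus casefold() for the case-insensitive variant is my assumption about one reasonable way to combine the two behaviors without a separate type for each combination.

```python
import unicodedata


def compare_key(s: str, case_insensitive: bool = False) -> str:
    """Build a comparison key; the input string itself is never modified.

    Hypothetical helper: NFC is assumed as the comparison form.
    """
    key = unicodedata.normalize("NFC", s)
    if case_insensitive:
        # Case folding can denormalize some strings, so normalize again.
        key = unicodedata.normalize("NFC", key.casefold())
    return key


def text_equal(a: str, b: str, case_insensitive: bool = False) -> bool:
    """Compare two strings by their keys rather than by raw code points."""
    return compare_key(a, case_insensitive) == compare_key(b, case_insensitive)


# The same word in two valid encodings: precomposed vs. combining accent.
composed = "caf\u00e9"     # 'e with acute' as a single code point
decomposed = "cafe\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

stored = decomposed  # what the user entered is kept untouched

print(stored == composed)            # raw code point comparison
print(text_equal(stored, composed))  # normalization-aware comparison
print(text_equal("CAFE\u0301", composed, case_insensitive=True))
```

Case-insensitivity here is a flag on the comparison rather than a property of the stored data, which is roughly how CITEXT treats case.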

Best,

David
