Re: Hostnames, IDNs, Punycode and Unicode Case Folding

From: Mike Cardwell <pgsql(at)lists(dot)grepular(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Hostnames, IDNs, Punycode and Unicode Case Folding
Date: 2014-12-30 00:18:58
Message-ID: 20141230001858.GA24297@glue.grepular.com
Lists: pgsql-general

* on the Mon, Dec 29, 2014 at 07:00:05PM -0500, Andrew Sullivan wrote:

>> CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames (lower(punycode_encode(hostname)));
>
> This wouldn't work to get the original back if you have any IDNA2003
> data, because punycode-encoding the UTF-8 under IDNA2003 and then
> punycode-decoding it doesn't always result in the same label. See my
> other message.

The original is the thing that is stored in the database. I wouldn't need to
do any conversion to get the original back. In my example I am storing
the original and creating an index on the punycode version.

This is exactly the same method that we commonly use for performing
case-insensitive text searches using lower() indexes.
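
For illustration, here is a minimal sketch of that lower() pattern. It is not
the PostgreSQL setup from this thread: it assumes Python's bundled sqlite3
module as a stand-in so the example is self-contained, and it leaves out
punycode_encode() because that function is not a built-in. The sample
hostnames are made up. The point it shows is that the stored value is never
transformed; only the index key and the lookup expression are.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hostnames (hostname TEXT NOT NULL)")

# Unique index on the derived (lower-cased) value, not on the stored value.
conn.execute(
    "CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames (lower(hostname))"
)

# The original spelling is what gets stored.
conn.execute("INSERT INTO hostnames (hostname) VALUES (?)", ("Example.COM",))

# A value that differs only by case collides on the index key.
try:
    conn.execute("INSERT INTO hostnames (hostname) VALUES (?)", ("example.com",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Look up with the same expression the index uses; the row comes back
# exactly as it was inserted, with no decoding step involved.
stored = conn.execute(
    "SELECT hostname FROM hostnames WHERE lower(hostname) = lower(?)",
    ("EXAMPLE.com",),
).fetchone()[0]
```

Here duplicate_rejected ends up True and stored holds the original
"Example.COM". Swapping lower(hostname) for lower(punycode_encode(hostname))
would follow the same shape, assuming such a function is available and
immutable.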

--
Mike Cardwell https://grepular.com https://emailprivacytester.com
OpenPGP Key 35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F
XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4
