From: | Garen Torikian <gjtorikian(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: [PATCH] Expand character set for ltree labels |
Date: | 2022-10-04 23:16:30 |
Message-ID: | CAGXsc+-jhKJvSaqTWYa_PkrmA0ANWPfpte41ijwYpUjx7GNrHQ@mail.gmail.com |
Lists: | pgsql-hackers |
No, not quite.
Valid Punycode characters are `[A-Za-z0-9-]`. This proposal includes `-`,
as well as `#` and `;` for HTML entities.
I double-checked the RFC to see the valid Punycode characters and the set
above is indeed correct:
https://datatracker.ietf.org/doc/html/draft-ietf-idn-punycode-02#section-5
While it would be nice for ltree labels to support *any* printable
character, they can't, because symbols like `!` and `%` already have special
meaning in ltree queries. This proposal leaves those as is and does not
reuse any character that already has special meaning.
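
To illustrate the character set described above, here is a minimal sketch in Python. It is not the patch itself, which changes ltree's C code. It assumes the existing ltree label characters are `A-Za-z0-9_` and adds the `-`, `#`, and `;` from this proposal. The names `PROPOSED_LABEL`, `is_proposed_label`, and `to_punycode_label` are hypothetical, and Python's built-in `punycode` codec stands in for whatever encoder an application would actually use.

```python
import re

# Assumption: existing ltree labels accept A-Za-z0-9 and underscore;
# the proposal in this thread adds '-', '#', and ';'.
PROPOSED_LABEL = re.compile(r"[A-Za-z0-9_\-#;]+")


def is_proposed_label(label: str) -> bool:
    """Return True if the label is non-empty and uses only the proposed set."""
    return PROPOSED_LABEL.fullmatch(label) is not None


def to_punycode_label(text: str) -> str:
    """Encode arbitrary text with Python's built-in punycode codec."""
    return text.encode("punycode").decode("ascii")


if __name__ == "__main__":
    # Punycode output for non-ASCII text should fit the proposed set.
    for text in ["münchen", "日本語"]:
        encoded = to_punycode_label(text)
        print(text, "->", encoded, is_proposed_label(encoded))

    # Characters with special meaning in queries stay excluded.
    for label in ["foo-bar", "a!b", "a%b"]:
        print(label, is_proposed_label(label))
```

The point of the sketch is only that Punycode-encoded text fits within the proposed set, while `!` and `%` remain outside it.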
On Tue, Oct 4, 2022 at 6:32 PM Nathan Bossart <nathandbossart(at)gmail(dot)com>
wrote:
> On Tue, Oct 04, 2022 at 12:54:46PM -0400, Garen Torikian wrote:
> > The punycode range of characters is the exact same set as the existing
> > ltree range, with the addition of a hyphen (-). Within this system, any
> > human language can be encoded using just A-Za-z0-9-.
>
> IIUC ASCII characters like '!' and '<' are valid Punycode characters, but
> even with your proposal, those wouldn't be allowed.
>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com
>