Re: [PATCH] Expand character set for ltree labels

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Garen Torikian <gjtorikian(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Expand character set for ltree labels
Date: 2022-10-05 18:59:01
Message-ID: 2438294.1664996341@sss.pgh.pa.us
Lists: pgsql-hackers

Garen Torikian <gjtorikian(at)gmail(dot)com> writes:
> I am submitting a patch to expand the label requirements for ltree.

> The current format is restricted to alphanumeric characters, plus _.
> Unfortunately, for non-English labels, this set is insufficient.

Hm? Perhaps the docs are a bit unclear about that, but it's not
restricted to ASCII alphanumerics. AFAICS the code will accept
whatever iswalpha() and iswdigit() will accept in the database's
default locale. There's certainly work that could/should be done
to allow use of not-so-default locales, but that's not specific
to ltree. I'm not sure that doing an application-side encoding
is attractive compared to just using that ability directly.
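A minimal sketch of the kind of check described above, for illustration only. It is not the actual ltree source: the function names label_char_ok() and label_ok() are hypothetical, and rejecting an empty label is an assumption. It shows a label test built on iswalpha() and iswdigit() plus the underscore, so what counts as a letter is decided by the active locale rather than by an ASCII table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: true if one wide character may appear in a label.
 * iswalpha()/iswdigit() consult the current LC_CTYPE locale, so the set of
 * accepted letters is locale-dependent, not ASCII-only.
 */
bool label_char_ok(wint_t wc)
{
    return iswalpha(wc) || iswdigit(wc) || wc == (wint_t) L'_';
}

/*
 * Hypothetical helper: true if every character of the label is acceptable.
 * Rejecting the empty label is an assumption made for this sketch.
 */
bool label_ok(const wchar_t *label)
{
    assert(label != NULL);

    if (*label == L'\0')
        return false;

    for (const wchar_t *p = label; *p != L'\0'; p++)
    {
        if (!label_char_ok((wint_t) *p))
            return false;
    }
    return true;
}
```

With these definitions, label_ok(L"Top_Science1") is true, while label_ok(L"a#b") and label_ok(L"a;b") are false. Whether a non-ASCII letter passes depends on the locale the process selected, for example through setlocale(LC_CTYPE, "").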

If you do want to do application-side encoding, I'm unsure why
punycode would be the choice anyway, as opposed to something
that can fit in the existing restrictions.
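As one illustration of an encoding that stays inside the existing character set, here is a sketch that hex-escapes every byte that is not an ASCII letter or digit as _XX. The scheme and the name encode_label() are assumptions made for this example, not something proposed in the thread; the underscore itself is escaped so that the mapping stays reversible. Note that the output is up to three times longer than the input, which matters for any label length limit.

```c
#include <assert.h>
#include <stddef.h>

/* ASCII-only test, deliberately independent of the current locale. */
static int is_ascii_alnum(unsigned char c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/*
 * Hypothetical encoder: copies src (NUL-terminated bytes, e.g. UTF-8) into
 * dst, replacing every byte that is not an ASCII letter or digit with
 * '_' followed by two uppercase hex digits.
 *
 * Returns the encoded length, or -1 if dst (of size dstsize, including the
 * terminating NUL) is too small.
 */
int encode_label(const char *src, char *dst, size_t dstsize)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    for (const unsigned char *p = (const unsigned char *) src; *p != '\0'; p++)
    {
        if (is_ascii_alnum(*p))
        {
            if (n + 1 >= dstsize)   /* keep room for the NUL */
                return -1;
            dst[n++] = (char) *p;
        }
        else
        {
            if (n + 3 >= dstsize)   /* three output bytes plus the NUL */
                return -1;
            dst[n++] = '_';
            dst[n++] = hex[*p >> 4];
            dst[n++] = hex[*p & 0x0F];
        }
    }

    if (n >= dstsize)
        return -1;
    dst[n] = '\0';
    return (int) n;
}
```

For example, the input "a#b;" encodes to "a_23b_3B", and the two UTF-8 bytes of "é" (0xC3 0xA9) encode to "_C3_A9".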

> On top of this, I added support for two more characters: # and ;, which are
> used for HTML entities.

That seems really pretty random.

regards, tom lane
