ts_locale.c: why no t_isalnum() test?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: ts_locale.c: why no t_isalnum() test?
Date: 2022-10-05 19:53:35
Message-ID: 2548310.1664999615@sss.pgh.pa.us
Lists: pgsql-hackers

I happened to wonder why various places are testing things like

#define ISWORDCHR(c) (t_isalpha(c) || t_isdigit(c))

rather than using an isalnum-equivalent test. The direct answer
is that ts_locale.c/.h provides no such test function, apparently
because there aren't many potential callers in the core code.
However, both pg_trgm and ltree could benefit from adding one.

There's no semantic hazard here: the documentation I consulted
is all pretty explicit that is[w]alnum is true exactly when
either is[w]alpha or is[w]digit is. For example, POSIX saith:

    The iswalpha() and iswalpha_l() functions shall test whether wc is a
    wide-character code representing a character of class alpha in the
    current locale, or in the locale represented by locale, respectively;
    see XBD Locale.

    The iswdigit() and iswdigit_l() functions shall test whether wc is a
    wide-character code representing a character of class digit in the
    current locale, or in the locale represented by locale, respectively;
    see XBD Locale.

    The iswalnum() and iswalnum_l() functions shall test whether wc is a
    wide-character code representing a character of class alpha or digit
    in the current locale, or in the locale represented by locale,
    respectively; see XBD Locale.
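
The equivalence quoted above is easy to spot-check. The following is only a sketch: the helper name alnum_matches_alpha_or_digit() is made up for illustration, and it tests whatever locale is current, so a clean result over one range says nothing about other locales or ranges.

```c
#include <stdbool.h>
#include <wctype.h>

/*
 * Hypothetical helper: returns true if iswalnum() agrees with
 * (iswalpha() || iswdigit()) for every code in [lo, hi] under the
 * current locale.  Callers are expected to pass lo <= hi.
 */
static bool
alnum_matches_alpha_or_digit(wint_t lo, wint_t hi)
{
    for (wint_t wc = lo;; wc++)
    {
        bool        combined = iswalpha(wc) || iswdigit(wc);
        bool        direct = iswalnum(wc) != 0;

        if (combined != direct)
            return false;       /* found a code where the two disagree */
        if (wc >= hi)
            break;              /* stop here; avoids wrapping past hi */
    }
    return true;
}
```

Calling alnum_matches_alpha_or_digit(0, 0x7F) in the default "C" locale checks the ASCII range.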

While I didn't try to actually measure it, the existing t_isalpha()
and t_isdigit() functions don't look remarkably cheap. Doing
char2wchar() twice when we only need to do it once seems silly, and
the libc functions themselves are probably none too cheap for
multibyte characters either.
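
To make the cost argument concrete, here is a standalone sketch that counts conversions. It is not the ts_locale.c code: to_wide() and the sketch_* names are hypothetical, and plain mbrtowc() stands in for char2wchar().

```c
#include <stdbool.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

static int  conversions = 0;    /* counts multibyte-to-wide conversions */

/* Hypothetical stand-in for the conversion step; uses plain mbrtowc(). */
static wint_t
to_wide(const char *ptr)
{
    mbstate_t   state;
    wchar_t     wc;
    size_t      len;

    conversions++;
    memset(&state, 0, sizeof(state));
    len = mbrtowc(&wc, ptr, strlen(ptr), &state);
    if (len == (size_t) -1 || len == (size_t) -2)
        return WEOF;            /* invalid or truncated sequence */
    return (wint_t) wc;
}

/* Separate tests: each one converts the character on its own. */
static bool
sketch_isalpha(const char *ptr)
{
    return iswalpha(to_wide(ptr)) != 0;
}

static bool
sketch_isdigit(const char *ptr)
{
    return iswdigit(to_wide(ptr)) != 0;
}

/* Combined test: one conversion, one classification call. */
static bool
sketch_isalnum(const char *ptr)
{
    return iswalnum(to_wide(ptr)) != 0;
}
```

With these definitions, sketch_isalpha(p) || sketch_isdigit(p) converts twice for any character that is not alphabetic (digits and non-word characters alike), whereas sketch_isalnum(p) always converts once.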

Hence, I propose the attached. I got rid of some places that were
unnecessarily checking pg_mblen before applying t_iseq(), too.
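
One reading of why the length check is redundant, offered as an assumption rather than something spelled out here: if t_iseq() only compares the first byte against an ASCII character, and every byte of a multibyte character has its high bit set (as in UTF-8), then the comparison can never match a multibyte character by accident. The helper below is a hypothetical illustration, not the actual macro.

```c
#include <stdbool.h>

/*
 * Hypothetical first-byte comparison in the spirit of t_iseq().
 * Assumes c is an ASCII character and that the encoding never uses
 * ASCII byte values inside a multibyte character.
 */
static bool
first_byte_is(const char *ptr, char c)
{
    return *(const unsigned char *) ptr == (unsigned char) c;
}
```

For example, first_byte_is("_x", '_') is true, while first_byte_is("\xC3\xA9", '_') is false because the UTF-8 lead byte 0xC3 cannot equal any ASCII value.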

regards, tom lane

Attachment                      Content-Type  Size
add-t_isalnum-function-1.patch  text/x-diff   5.2 KB
