Re: Built-in CTYPE provider

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, "Davis, Jeff" <jefdavj(at)amazon(dot)com>
Subject: Re: Built-in CTYPE provider
Date: 2023-12-21 23:00:26
Message-ID: d7ff0299ce370c82bef94a8b2425e7c236bc803e.camel@j-davis.com
Lists: pgsql-hackers

On Wed, 2023-12-20 at 15:47 -0800, Jeremy Schneider wrote:

> One other thing that comes to mind: how does the parser do case folding
> for relation names? Is that using OS-provided libc as of today? Or did
> we code it to use ICU if that's the DB default? I'm guessing libc, and
> global catalogs probably need to be handled in a consistent manner, even
> across different encodings.

The code is in downcase_identifier():

    /*
     * SQL99 specifies Unicode-aware case normalization, which we don't
     * yet have the infrastructure for...
     */
    if (ch >= 'A' && ch <= 'Z')
        ch += 'a' - 'A';
    else if (enc_is_single_byte && IS_HIGHBIT_SET(ch) && isupper(ch))
        ch = tolower(ch);
    result[i] = (char) ch;

My proposal would add the infrastructure that the comment above says is
missing.
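
To make the effect of that excerpt concrete, here is a minimal standalone sketch of the same per-byte folding logic. It is not the actual PostgreSQL function: the name fold_identifier_sketch, the HIGHBIT_SET macro (a stand-in for IS_HIGHBIT_SET), and the plain malloc-based allocation are all assumptions made for illustration, and the real code also handles truncation and uses PostgreSQL's own memory management.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's IS_HIGHBIT_SET macro. */
#define HIGHBIT_SET(ch) (((unsigned char) (ch)) & 0x80)

/*
 * Illustrative sketch only: fold an identifier byte by byte.
 * ASCII A-Z is always lowercased. High-bit bytes are passed to
 * tolower() only when the encoding is single-byte; in a multibyte
 * encoding such as UTF-8 they are copied through unchanged.
 * The caller must free() the returned string.
 */
static char *
fold_identifier_sketch(const char *ident, bool enc_is_single_byte)
{
    size_t len;
    char  *result;

    assert(ident != NULL);
    len = strlen(ident);
    result = malloc(len + 1);
    if (result == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) ident[i];

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';
        else if (enc_is_single_byte && HIGHBIT_SET(ch) && isupper(ch))
            ch = (unsigned char) tolower(ch);
        result[i] = (char) ch;
    }
    result[len] = '\0';
    return result;
}
```

Under these assumptions, calling fold_identifier_sketch("\xC3\x89" "COLE", false), which is the UTF-8 encoding of "ÉCOLE", lowercases only the ASCII letters and leaves the two bytes of "É" untouched. That gap is what Unicode-aware case mapping would close.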

It seems like we should be using the database collation at this point
because you don't want inconsistency between the catalogs and the
parser here. Then again, the SQL spec doesn't seem to support tailoring
of case conversions, so maybe we are avoiding it for that reason? Or
maybe we're avoiding catalog access? Or perhaps the work for ICU just
wasn't done here yet?

> (Kindof related... did you ever see the demo where I create a user named
> '🏃' and then I try to connect to a database with non-unicode encoding?
> 💥😜  ...at least it seems to be able to walk the index without decoding
> strings to find other users - but the way these global catalogs work
> scares me a little bit)

I didn't see that specific demo, but in general we seem to change
between pg_wchar and Unicode code points too freely, so I'm not
surprised that something went wrong.

Regards,
Jeff Davis
