| From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
|---|---|
| To: | Peter Eisentraut <peter(at)eisentraut(dot)org>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Remaining dependency on setlocale() |
| Date: | 2025-12-23 20:09:08 |
| Message-ID: | ac83c2376d4ce2ad423f48c4fe2c416fc5d2b346.camel@j-davis.com |
| Lists: | pgsql-hackers |
On Wed, 2025-12-17 at 11:39 +0100, Peter Eisentraut wrote:
> For Metaphone, I found the reference implementation linked from its
> Wikipedia page, and it looks like our implementation is pretty closely
> aligned to that. That reference implementation also contains the
> C-with-cedilla case explicitly. The correct fix here would probably be
> to change the implementation to work on wide characters. But I think
> for the moment you could try a shortcut like, use pg_ascii_toupper(),
> but if the encoding is LATIN1 (or LATIN9 or whichever other encodings
> also contain C-with-cedilla at that code point), then explicitly
> uppercase that one as well. This would preserve the existing
> behavior.
Done; new patches attached.

Interestingly, WIN1256 encodes only the SMALL LETTER C WITH CEDILLA. For
the purposes here, I think we can still consider it to "uppercase" to
\xc7, so that it is treated as the same sound. Technically, I think that
would be an improvement over the current code in this edge case, and it
suggests that case folding would be a better approach than uppercasing.
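
For readers following along, the shortcut discussed above could look roughly like the sketch below. This is not the code in the attached patches: the `sketch_encoding` enum, `has_c_cedilla_at_e7()`, and `metaphone_toupper()` are hypothetical names made up for illustration, and `ascii_toupper()` is only a stand-in for pg_ascii_toupper(). The sketch assumes that \xe7 is the small c-with-cedilla in LATIN1, LATIN9, and WIN1256, and that mapping it to \xc7 only needs to produce a consistent "same sound" marker rather than a valid capital letter in every encoding.

```c
#include <stdbool.h>

/* Hypothetical encoding tags for this sketch; not PostgreSQL's real identifiers. */
typedef enum
{
    ENC_OTHER,
    ENC_LATIN1,
    ENC_LATIN9,
    ENC_WIN1256
} sketch_encoding;

/* ASCII-only uppercasing that does not depend on the process locale. */
static unsigned char
ascii_toupper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch = (unsigned char) (ch - 'a' + 'A');
    return ch;
}

/* Assumption: these encodings place small c-with-cedilla at byte 0xE7. */
static bool
has_c_cedilla_at_e7(sketch_encoding enc)
{
    return enc == ENC_LATIN1 || enc == ENC_LATIN9 || enc == ENC_WIN1256;
}

/*
 * Uppercase one byte for Metaphone-style matching: ASCII letters are
 * uppercased, and 0xE7 is mapped to 0xC7 in the encodings listed above.
 * In WIN1256 the byte 0xC7 is not a capital C-with-cedilla; it is used
 * here only as a marker for the same sound.
 */
static unsigned char
metaphone_toupper(unsigned char ch, sketch_encoding enc)
{
    if (ch == 0xE7 && has_c_cedilla_at_e7(enc))
        return 0xC7;
    return ascii_toupper(ch);
}
```

All other non-ASCII bytes pass through unchanged, so the result no longer depends on setlocale() or on the C library's toupper().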
Regards,
Jeff Davis
| Attachment | Content-Type | Size |
|---|---|---|
| v13-0001-fuzzystrmatch-use-pg_ascii_toupper.patch | text/x-patch | 6.7 KB |
| v13-0002-Control-LC_COLLATE-with-GUC.patch | text/x-patch | 7.2 KB |