From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Haotian Yang <yangnw(at)live(dot)com> |
Cc: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: `pg_trgm` not recognizing Chinese characters in macOS |
Date: | 2018-09-11 13:20:13 |
Message-ID: | 18165.1536672013@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Haotian Yang <yangnw(at)live(dot)com> writes:
> Versions: macOS 10.13.6, PostgreSQL 10.5, pg_trgm 1.3.
> LC_ALL=en_US.UTF-8
pg_trgm relies on libc's functions (specifically, iswalpha()) to determine
which characters count as word characters. Unfortunately, the UTF-8 locale support
in macOS is pretty incomplete, and I don't find it too surprising that
it's not recognizing Chinese characters as alphabetic. Now, you could
make a good argument that they *shouldn't* be considered alphabetic in
an en_US locale; but I'm unsure whether switching to a more appropriate
locale will help.
Anyway, I'd first try zh_CN.UTF-8, and if that doesn't fix it, the place
to complain is https://bugreport.apple.com/ ... I'm sure they know about
it already, but the number of reports has an impact on how fast they
fix things.
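
To see what libc itself reports, independent of PostgreSQL, a small probe along these lines may help. It is only a sketch: the helper name alpha_in_locale and the test character U+4E2D are illustrative choices, not anything taken from pg_trgm, and it assumes a platform where wchar_t holds Unicode code points in UTF-8 locales.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not part of pg_trgm or PostgreSQL).
 * Switches the process-wide LC_CTYPE locale and asks libc whether
 * wc is alphabetic there. Returns 1 if iswalpha() accepts it,
 * 0 if it rejects it, and -1 if the locale cannot be selected.
 * The previous locale is not restored.
 */
int alpha_in_locale(const char *locale_name, wchar_t wc)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return iswalpha((wint_t) wc) != 0;
}

#ifdef PROBE_MAIN
/* Build with -DPROBE_MAIN to get a standalone program. */
int main(void)
{
    /* The two locales discussed above. */
    const char *locales[] = {"en_US.UTF-8", "zh_CN.UTF-8"};
    /* U+4E2D, assumed here to be a representative Chinese character. */
    const wchar_t sample = (wchar_t) 0x4E2D;

    for (size_t i = 0; i < sizeof(locales) / sizeof(locales[0]); i++)
        printf("%s: %d\n", locales[i], alpha_in_locale(locales[i], sample));
    return 0;
}
#endif
```

If this prints 0 for a locale, libc is rejecting the character in that locale, which would point at the platform rather than at pg_trgm. A result of -1 only means the locale is not installed.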
regards, tom lane