From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | andres(at)anarazel(dot)de, pgsql-hackers(at)postgresql(dot)org, teodor(at)sigaev(dot)ru |
Subject: | Re: pg_trgm |
Date: | 2010-05-27 14:15:45 |
Message-ID: | 14655.1274969745@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
>> It's not a problem, it's just pilot error, or possibly inadequate
>> documentation. pg_trgm uses the locale's definition of "alpha",
>> "digit", etc. In C locale only basic ASCII letters and digits will be
>> recognized as word constituents.
> That means there is no chance to make pg_trgm work with multibyte + C
> locale? If so, I will leave pg_trgm as it is and provide private
> patches for those who need the functionality.
Exactly what do you consider to be the missing functionality?
You need a notion of word vs non-word character from somewhere,
and the locale setting is the standard place to get that. The
core text search functionality behaves the same way.
regards, tom lane
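
To illustrate the point about needing a notion of word vs non-word character, here is a minimal Python sketch. It is not pg_trgm's actual C code. The two predicates are assumptions made for illustration: is_word_char_c_locale stands in for C-locale behavior (basic ASCII letters and digits only), and is_word_char_unicode stands in for a locale that classifies multibyte characters as alphanumeric. The padding of each word with two leading spaces and one trailing space follows the pg_trgm documentation; all function names are hypothetical.

```python
def is_word_char_c_locale(ch):
    # Assumption: approximates C-locale classification, where only
    # basic ASCII letters and digits count as word constituents.
    return ch.isascii() and ch.isalnum()


def is_word_char_unicode(ch):
    # Assumption: approximates a locale that also recognizes
    # non-ASCII letters and digits as word constituents.
    return ch.isalnum()


def split_words(text, is_word_char):
    # Break the lowercased text into runs of word characters;
    # everything else acts as a separator.
    words, current = [], []
    for ch in text.lower():
        if is_word_char(ch):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


def trigrams(text, is_word_char):
    # Pad each word with two leading spaces and one trailing space,
    # then collect every run of three consecutive characters.
    result = set()
    for word in split_words(text, is_word_char):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result
```

With the ASCII-only predicate, a string made up entirely of multibyte characters contains no word characters, so no trigrams are produced; swapping in the other predicate is the only change needed to get trigrams from it.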