Re: pg_trgm

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: andres(at)anarazel(dot)de, pgsql-hackers(at)postgresql(dot)org, teodor(at)sigaev(dot)ru
Subject: Re: pg_trgm
Date: 2010-05-27 14:15:45
Message-ID: 14655.1274969745@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
>> It's not a problem, it's just pilot error, or possibly inadequate
>> documentation. pg_trgm uses the locale's definition of "alpha",
>> "digit", etc. In C locale only basic ASCII letters and digits will be
>> recognized as word constituents.

> That means there is no chance to make pg_trgm work with multibyte + C
> locale? If so, I will leave pg_trgm as it is and provide private
> patches for those who need the functionality.

Exactly what do you consider to be the missing functionality?
You need a notion of word vs non-word character from somewhere,
and the locale setting is the standard place to get that. The
core text search functionality behaves the same way.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2010-05-27 14:20:40 Re: pg_trgm
Previous Message Robert Haas 2010-05-27 14:13:07 Re: Streaming Replication: Checkpoint_segment and wal_keep_segments on standby