Re: Can pg_trgm handle non-alphanumeric characters?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: MauMau <maumau307(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Euler Taveira <euler(at)timbira(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Can pg_trgm handle non-alphanumeric characters?
Date: 2012-05-10 19:11:57
Message-ID: 22869.1336677117@sss.pgh.pa.us
Lists: pgsql-hackers

Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
> On Fri, May 11, 2012 at 12:07 AM, MauMau <maumau307(at)gmail(dot)com> wrote:
>> Thanks for your explanation. Although I haven't understood it well yet, I'll
>> consider what you taught. And I'll consider if the tentative measure of
>> removing KEEPONLYALNUM is correct for someone who wants to use pg_trgm
>> against Japanese text.

> In Japanese, it's common to do a text search with a two-character keyword.
> But since pg_trgm is 3-gram, you basically would not be able to use an index
> for such a text search. So you might need something like pg_bigm or pg_unigm
> for Japanese text search.

I believe the trigrams are three *bytes* not three characters.  So a
couple of kanji should work just fine for this.

			regards, tom lane
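
To make the byte-versus-character distinction above concrete, here is a minimal Python sketch. It is not pg_trgm's code: it assumes UTF-8 encoding, uses a hypothetical helper `sliding_trigrams` and an arbitrary example keyword, and ignores any padding or normalization pg_trgm applies. Which unit pg_trgm really uses is the question under discussion; the sketch only shows what each choice would imply for a two-character keyword.

```python
def sliding_trigrams(units):
    """Return every window of three consecutive units (bytes or characters).

    Hypothetical helper for illustration only; not part of pg_trgm.
    """
    return [units[i:i + 3] for i in range(len(units) - 2)]


keyword = "東京"  # a two-character keyword, chosen only as an example

# Windows over characters: two characters cannot fill a 3-character window.
char_trigrams = sliding_trigrams(keyword)

# Windows over UTF-8 bytes (assumed encoding): each kanji here is 3 bytes.
byte_trigrams = sliding_trigrams(keyword.encode("utf-8"))

print(len(keyword), len(char_trigrams))                  # prints "2 0"
print(len(keyword.encode("utf-8")), len(byte_trigrams))  # prints "6 4"
```

Under the character interpretation the keyword yields no trigram at all, which is the concern raised in the quoted text; under the byte interpretation the same keyword yields four.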
