Re: Behavior of a pg_trgm index for 2 (or < 3) character LIKE queries

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Behavior of a pg_trgm index for 2 (or < 3) character LIKE queries
Date: 2013-05-31 00:19:58
Message-ID: CA+HiwqFkeoyAxsCqEoQOf8UcQ68u7i4J5jZQBSsSrX44Kyh+WQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 31, 2013 at 4:25 AM, Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
> On Thu, May 30, 2013 at 12:49 PM, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
>>
>> The following thread discusses partial matching in pg_trgm. I hope
>> this helps.
>>
>> <http://www.postgresql.org/message-id/CAHGQGwFJshvV2nGME19wdTW9teFw_w7h2ns4E+YYsjkB9WdWDQ@mail.gmail.com>
>> As you may know, if the search string contains multibyte characters,
>> the trigram is converted to a 4-byte CRC, which is used as the key
>> (only the upper 3 bytes of the CRC are used),
>> so we can do partial matching if KEEPONLYALNUM is enabled.
>
>
> Please read the further discussion on that thread. Because of the CRC, we
> can't do partial matching, regardless of KEEPONLYALNUM.
>

Thank you Sawada-san and Alexander.

I think the idea of using the trigram text itself as the GIN key (?),
rather than its CRC (which causes the problems with partial matching),
has not been implemented in pg_trgm yet, right? And even if such a
facility were added, we would still need to handle the multibyte
character case differently (even for partial matching), is that right?
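
For illustration only, here is a minimal Python sketch of why a hashed key
defeats partial matching. It is not pg_trgm's actual C code: the names
trigram_key and prefix_matches are hypothetical, zlib.crc32 stands in for
whatever CRC pg_trgm computes, and taking the top 3 bytes by shifting is an
assumption based on the description quoted above.

```python
import zlib


def trigram_key(trigram: str) -> bytes:
    """Hypothetical stand-in for compacting one trigram into a 3-byte key."""
    raw = trigram.encode("utf-8")
    if len(raw) == 3:
        # Single-byte characters: the key is the trigram text itself.
        return raw
    # Multibyte characters: hash the bytes and keep the upper 3 bytes
    # of the 32-bit CRC (assumed byte selection, see lead-in).
    crc = zlib.crc32(raw) & 0xFFFFFFFF
    return (crc >> 8).to_bytes(3, "big")


def prefix_matches(prefix: str, trigram: str) -> bool:
    """Partial match expressed as a byte-prefix test on the raw text."""
    return trigram.encode("utf-8").startswith(prefix.encode("utf-8"))


# With a text key, a 2-character prefix can be compared against the key.
ascii_key = trigram_key("abc")
ascii_prefix_ok = ascii_key.startswith(b"ab")

# With a hashed key, the prefix bytes are no longer present in the key,
# even though the prefix does match the original trigram text.
mb_key = trigram_key("日本語")
mb_text_ok = prefix_matches("日本", "日本語")
mb_key_ok = mb_key.startswith("日本".encode("utf-8"))
```

In this sketch ascii_prefix_ok and mb_text_ok are true while mb_key_ok is
false, which is the sense in which the CRC gets in the way of partial
matching.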

--
Amit Langote
