Re: pg_trgm version 1.2

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_trgm version 1.2
Date: 2015-06-29 12:23:43
Message-ID: CAHyXU0wzXikiKEe6Ffrp=qXRq3+jA7q+LeFr0HZoi4XSu4A+BA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 27, 2015 at 5:17 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> This patch implements version 1.2 of contrib module pg_trgm.
>
> This supports the triconsistent function, introduced in version 9.4 of the
> server, to make it faster to implement indexed queries where some keys are
> common and some are rare.
>
> I've included the paths to both upgrade and downgrade between 1.1 and 1.2,
> although after doing so you must close and restart the session before you
> can be sure the change has taken effect. There is no change to the on-disk
> index structure.
>
> This shows the difference it can make in some cases:
>
> create extension pg_trgm version "1.1";
>
> create table foo as select
>   md5(random()::text) ||
>   case when random()<0.000005 then 'lmnop' else '123' end ||
>   md5(random()::text) as bar
> from generate_series(1,10000000);
>
> create index on foo using gin (bar gin_trgm_ops);
>
> --some queries
>
> alter extension pg_trgm update to "1.2";
>
> --close, reopen, more queries
>
> select count(*) from foo where bar like '%12344321lmnabcddd%';
>
> V1.1: Time: 1743.691 ms --- after repeated execution to warm the cache
> V1.2: Time: 2.839 ms --- after repeated execution to warm the cache

Wow! I'm going to test this. I have some data sets for which trigram
searching isn't really practical: if the search string touches
trigrams with a lot of duplication, the algorithm can have trouble
beating brute-force searches.
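
To make the "lots of duplication" problem concrete, here is a toy model in
Python. It is only a sketch under stated assumptions, not the GIN or pg_trgm
implementation: the function names (trigrams, build_index,
candidates_rare_first, search) are hypothetical, the index is a plain
in-memory dict, and trigrams are taken without the blank padding pg_trgm
applies. It shows why starting from the rarest trigram keeps the candidate
set small when other trigrams in the search string are very common.

```python
from collections import defaultdict

def trigrams(s):
    """Return the set of 3-character substrings of s (no padding; a simplification)."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_index(rows):
    """Map each trigram to the set of row ids containing it (a toy posting list)."""
    index = defaultdict(set)
    for row_id, text in enumerate(rows):
        for t in trigrams(text):
            index[t].add(row_id)
    return index

def candidates_rare_first(index, needle):
    """Intersect posting lists, starting from the shortest (rarest) one."""
    lists = sorted((index.get(t, set()) for t in trigrams(needle)), key=len)
    if not lists:
        return None  # needle is shorter than 3 characters: the index cannot help
    result = set(lists[0])  # copy so the index itself is never mutated
    for plist in lists[1:]:
        if not result:
            break  # a rare trigram already ruled out every row
        result &= plist
    return result

def search(rows, index, needle):
    """Recheck each candidate against the real text, since trigram matches are lossy."""
    cands = candidates_rare_first(index, needle)
    ids = range(len(rows)) if cands is None else sorted(cands)
    return [i for i in ids if needle in rows[i]]

rows = ["abc123def", "abc123xyz", "zzlmnopzz", "123123123"]
index = build_index(rows)
print(search(rows, index, "lmnop"))  # rare trigrams: only one row is ever rechecked
print(search(rows, index, "123"))    # common trigram: most rows must be rechecked
```

When every trigram in the search string is common, the candidate set stays
close to the whole table and the recheck step does nearly as much work as a
sequential scan, which is the case described above.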

Trigram searching is important: it's currently the only way to quickly
search string-encoded structures for partial strings.

merlin
