Re: Fastest Index/Algorithm to find similar sentences

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Janek Sendrowski <janek12(at)web(dot)de>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Fastest Index/Algorithm to find similar sentences
Date: 2013-07-26 05:58:51
Message-ID: CA+HiwqGXXsX1OdZKv7m4241GiyYg4bDU4rXtaDCW-Ac36ab7ww@mail.gmail.com
Lists: pgsql-general

On Fri, Jul 26, 2013 at 7:54 AM, Janek Sendrowski <janek12(at)web(dot)de> wrote:
> Hi,
>
> I'm searching for an algorithm or index to find similar sentences in a database.
>
> Full-text search is not really suitable because it doesn't have any tolerance.
>
> The Levenshtein distance is too slow.
>
> I also tried the pg_trgm module, which works with tri-grams, but it's also very slow with 100,000+ rows.
>
> I hope someone can help. I can't really find something that is fast enough.
>

Have you tried pg_bigm (a bi-gram based implementation)? It's still in
the development phase, but you could give it a try and see whether it
performs better where pg_trgm cannot.
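
To make the bi-gram vs. tri-gram idea concrete, here is a rough Python
sketch of n-gram set similarity. It is only an illustration, not how
pg_bigm or pg_trgm are actually implemented: the function names
(ngrams, similarity, most_similar), the single-space padding and the
Jaccard scoring are all assumptions made for this example.

```python
def ngrams(text, n):
    """Return the set of character n-grams of a lowercased string.

    The string is padded with one space on each side (an assumption of
    this sketch) so that word boundaries contribute n-grams too.
    """
    padded = " " + text.lower() + " "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n):
    """Jaccard similarity of the n-gram sets of a and b, from 0.0 to 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def most_similar(query, sentences, n=2):
    """Return (score, sentence) pairs, most similar first.

    This scans every sentence; a real index avoids that full scan.
    """
    scored = [(similarity(query, s, n), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling most_similar("the cat", ["the dog", "the cat sat", "xyz"]) ranks
"the cat sat" first. Passing n=3 switches to tri-grams, so you can
compare how the two granularities score the same pairs.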

--
Amit Langote
