Re: Similarity search for sentences

From: "Janek Sendrowski" <janek12(at)web(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Similarity search for sentences
Date: 2013-12-06 14:22:41
Message-ID: trinity-58602a33-a07d-4780-b4ea-83b8285e4906-1386339761095@3capp-webde-bs33
Lists: pgsql-general

Hi,
Thanks for your answers.
 
@Rémi Cura
You suggest a kind of full-text search. I have already tried the tsearch2 extension.
The problem is implementing the similarity search: I have to use many OR clauses, each with only a few terms.
That slows the FTS down significantly.
 
@Kevin Grittner
I used my own trigger to store the tsvector of each sentence, and I created an ordinary GiST index on that column.
What kind of functional index would you suggest? As I already told Rémi, I have to use many OR clauses with only a few terms each, which badly hurts performance.
Do you have a better idea?
I usually use a query like this:
 
The tiger is the largest cat species[http://en.wikipedia.org/wiki/Felidae], reaching a total body length of up to 3.3 m  and weighing up to 306 kg.
--------------------------------------------------------------------------------------------------------------------------------------------------
totsvector:
'3.3':16 '306':22 'bodi':11 'cat':6 'kg':23 'largest':5 'length':12 'm':17 'reach':8 'speci':7 'tiger':2 'total':10 'weigh':19
(1 row)
 
SELECT * FROM tablename WHERE vector @@ to_tsquery('speci & tiger & total & weigh') AND vector @@ to_tsquery('largest & length & m & reach')  AND vector @@ to_tsquery('3.3 & 306 & bodi & cat & kg');

And that's very slow.
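
For illustration, the OR-based variant described above could be written as a single tsquery instead of several separate conditions. This is only a sketch: it assumes the same `tablename` table and `vector` column as the query above, and the `ts_rank` ordering is an addition for ranking partial matches. Whether it is actually faster has not been measured here.

```sql
-- Sketch: the three term groups combined with | (OR) in one tsquery,
-- so a row matches if at least one group is fully contained in it.
-- ts_rank is used only to sort the closer matches first.
SELECT t.*, ts_rank(t.vector, q) AS rank
FROM tablename AS t,
     to_tsquery('(speci & tiger & total & weigh)
               | (largest & length & m & reach)
               | (3.3 & 306 & bodi & cat & kg)') AS q
WHERE t.vector @@ q
ORDER BY rank DESC;
```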
 
I didn't know that the pg_trgm module provides KNN search.
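
A minimal sketch of what such a KNN search with pg_trgm might look like. The `sentence` text column and the index name are assumptions for illustration and do not appear in the schema above. The `<->` operator returns the trigram distance (one minus the similarity), and a GiST index with `gist_trgm_ops` can serve the `ORDER BY ... LIMIT` directly.

```sql
-- Assumes tablename has a text column "sentence" (hypothetical name).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- A GiST trigram index supports distance-ordered (KNN) scans.
CREATE INDEX tablename_sentence_trgm_idx
    ON tablename USING gist (sentence gist_trgm_ops);

-- The ten sentences closest to the search text, nearest first.
SELECT sentence,
       sentence <-> 'The tiger is the largest cat species' AS dist
FROM tablename
ORDER BY dist
LIMIT 10;
```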
 
Janek Sendrowski
 
 
 
