Re: lexemes in prefix search going through dictionary modifications

From: Florian Pflug <fgp(at)phlo(dot)org>
To: sushant354(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: lexemes in prefix search going through dictionary modifications
Date: 2011-10-25 16:05:46
Message-ID: 7407A709-87E3-484D-9E7B-CCBAE7187BF9@phlo.org
Lists: pgsql-hackers

On Oct25, 2011, at 17:26 , Sushant Sinha wrote:
> I am currently using the prefix search feature in text search. I find
> that the prefix characters are treated the same as a normal lexeme and
> passed through stemming and stopword dictionaries. This seems like a bug
> to me.

Hm, I don't think so. If they don't pass through stopword dictionaries,
then queries containing stopwords will fail to find any rows - which is
probably not what one would expect.

Here's an example:

Query for records containing the* and car*. The @@ operator returns true,
because the stopword is removed from both the tsvector and the tsquery
(the 'english' dictionary drops 'these' and 'the' as stopwords and stems
'cars' to 'car'. Both the tsvector and the query end up being just 'car').

postgres=# select to_tsvector('english', 'these cars') @@ to_tsquery('english', 'the:* & car:*');
?column?
----------
t
(1 row)

Here's what happens if stopwords aren't removed from the query
(now the tsvector ends up being 'car', but the query is 'the:* & car:*'):

postgres=# select to_tsvector('english', 'these cars') @@ to_tsquery('simple', 'the:* & car:*');
?column?
----------
f
(1 row)
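
To see why the two results differ, it can help to look at the normalized
values on each side of the @@ operator separately. This is only a sketch,
assuming a stock installation where the 'english' and 'simple'
configurations have not been modified; the column aliases are just for
illustration.

```sql
-- Document side: 'these' is dropped as a stopword and 'cars' is stemmed.
SELECT to_tsvector('english', 'these cars') AS doc;

-- Query side with 'english': the stopword prefix term 'the:*' is dropped too.
SELECT to_tsquery('english', 'the:* & car:*') AS english_query;

-- Query side with 'simple': no stopword removal, so 'the:*' stays in the query.
SELECT to_tsquery('simple', 'the:* & car:*') AS simple_query;
```

With the default dictionaries one would expect the 'english' query to
reduce to the single prefix term for 'car', while the 'simple' query keeps
both terms. The tsvector contains no lexeme starting with 'the', so only
the first query can match.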

best regards,
Florian Pflug
