BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd

From: alex(at)hill(dot)net(dot)au
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd
Date: 2013-08-02 07:03:42
Message-ID: E1V59Oo-0007mB-GA@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 8354
Logged by: Alex Hill
Email address: alex(at)hill(dot)net(dot)au
PostgreSQL version: 9.2.4
Operating system: OS X 10.8.4 Mountain Lion
Description:

Hi all,

The docs for ts_rank_cd state:

"This function requires positional information in its input. Therefore it
will not work on "stripped" tsvector values — it will always return zero."

However, if a tsvector contains both stripped and non-stripped lexemes,
ts_rank_cd still ranks extents that include the non-stripped values.

For example, this evaluates to zero as expected:

SELECT ts_rank_cd(strip(to_tsvector('text search')),
                  plainto_tsquery('text search'));

But this doesn't:

SELECT ts_rank_cd(to_tsvector('text') || strip(to_tsvector('search')),
                  plainto_tsquery('text search'));
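
To see what ts_rank_cd is being handed in the second query, it may help to
inspect the concatenated tsvector directly. This is only a sketch: the
explicit 'english' configuration is an assumption, used so the result does
not depend on default_text_search_config.

```sql
-- Inspect the mixed tsvector: 'text' should keep its position,
-- while 'search' should appear with no position attached.
SELECT to_tsvector('english', 'text')
       || strip(to_tsvector('english', 'search')) AS mixed;

-- The match operator does not need positions, so the mixed
-- vector should still satisfy the query.
SELECT (to_tsvector('english', 'text')
        || strip(to_tsvector('english', 'search')))
       @@ plainto_tsquery('english', 'text search') AS matches;
```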

I think this is a bug, if not in the code then in the documentation, which
isn't clear on what happens when stripped and positioned lexemes are mixed
in one tsvector.

I would prefer that stripped lexemes be completely ignored by ts_rank_cd.
My use case is to treat stripped lexemes as a fifth pseudo-weight: they
match a @@ query but don't add to the ts_rank_cd ranking.
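
A rough sketch of that use case follows. The docs data, its column names
(title, body, tags), and the 'english' configuration are all hypothetical
and exist only for illustration. The intent is that the stripped tags
lexemes let a row match without contributing to its rank; that is the
behaviour I am asking for, not necessarily what 9.2.4 does today.

```sql
-- Hypothetical sample data; names and values are illustrative only.
WITH docs(id, title, body, tags) AS (
    VALUES (1, 'Text search'::text, 'Ranking documents'::text, 'postgres'::text),
           (2, 'Indexing',          'Notes on GIN',            'text search')
),
vectors AS (
    SELECT id,
           setweight(to_tsvector('english', title), 'A')
           || setweight(to_tsvector('english', body), 'D')
           -- Stripped lexemes: meant to match only, not to rank.
           || strip(to_tsvector('english', tags)) AS v
    FROM docs
)
SELECT id,
       ts_rank_cd(v, q) AS rank
FROM vectors,
     plainto_tsquery('english', 'text search') AS q
WHERE v @@ q
ORDER BY rank DESC;
```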

What do you think?

Cheers,
Alex
