From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | magicagent(at)gmail(dot)com |
Subject: | BUG #17569: false negative / positive results when using <-> (followed by) and tsvector limit (16383) hit |
Date: | 2022-08-03 17:51:10 |
Message-ID: | 17569-47d11a72a38bf8ae@postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 17569
Logged by: Alex Malek
Email address: magicagent(at)gmail(dot)com
PostgreSQL version: 14.4
Operating system: Red Hat
Description:
It is well documented that "Position values in tsvector must be greater than
0 and no more than 16,383".
However, these limits can produce false positive or false negative search
results when doing a FOLLOWED BY (<->) / phrase search in a document with
more than 16,383 words.
The false negative seems particularly bad / unexpected.
The false positive happens when a word is at or before position 16,382:
every word at or past position 16,383 then appears to be at 16,383.
SELECT tq, text, text @@ tq AS ok,
       repeat(' foo ',16381) || text @@ tq AS false_pos
FROM (VALUES (websearch_to_tsquery('"red cat"'),
              'red dogs chase with black cats')) t(tq, text);
       tq        |              text              | ok | false_pos
-----------------+--------------------------------+----+-----------
 'red' <-> 'cat' | red dogs chase with black cats | f  | t
(1 row)
The false negative happens for any phrase that occurs at or after position
16,383, since all of its words appear to be at 16,383.
SELECT tq, text, text @@ tq AS small,
       repeat(' foo ',16381) || text @@ tq AS false_neg
FROM (VALUES (websearch_to_tsquery('"black cat"'),
              'red dogs chase with black cats')) t(tq, text);
        tq         |              text              | small | false_neg
-------------------+--------------------------------+-------+-----------
 'black' <-> 'cat' | red dogs chase with black cats | t     | f
(1 row)
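To see where the positions end up, the stored tsvector for the padded
document can be inspected directly. This is only a minimal sketch: it assumes
the english text search configuration (the queries above use whatever
default_text_search_config is set to) and relies on unnest(tsvector) to expand
the vector into one row per lexeme. If the clamping described above holds,
'red' should be reported at 16382 and the remaining lexemes at 16383.

```sql
-- Sketch: list the positions stored for each lexeme of the padded document.
-- The 'english' configuration is an assumption, not taken from the report.
SELECT lexeme, positions
FROM unnest(
       to_tsvector('english',
                   repeat(' foo ', 16381) || 'red dogs chase with black cats')
     )
WHERE lexeme <> 'foo'   -- skip the padding word
ORDER BY positions;
```

With 'black' and 'cat' both stored at the same position, 'black' <-> 'cat'
cannot match, while 'red' at 16382 looks adjacent to every later word.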