BUG #12205: Getting wrong results from full text search

From: prasanna(at)semantifi(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #12205: Getting wrong results from full text search
Date: 2014-12-11 14:13:51
Message-ID: 20141211141351.2526.54502@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 12205
Logged by: prasanna kumar kuppa
Email address: prasanna(at)semantifi(dot)com
PostgreSQL version: 9.1.14
Operating system: windows
Description:

Hi,

This may table structure:
CREATE TABLE semantified_content_key_word
(
id bigint NOT NULL,
semantified_content_id bigint,
key_word_text text,
content_date timestamp without time zone NOT NULL,
context_id bigint NOT NULL,
CONSTRAINT pk_sckw_id PRIMARY KEY (id )
)
WITH (
OIDS=FALSE
);
ALTER TABLE semantified_content_key_word
OWNER TO postgres;

-- Index: idx_sckw_cd

-- DROP INDEX idx_sckw_cd;

CREATE INDEX idx_sckw_cd
ON semantified_content_key_word
USING btree
(content_date );

-- Index: idx_sckw_ctx_id

-- DROP INDEX idx_sckw_ctx_id;

CREATE INDEX idx_sckw_ctx_id
ON semantified_content_key_word
USING btree
(context_id );

-- Index: idx_sckw_kwt

-- DROP INDEX idx_sckw_kwt;

CREATE INDEX idx_sckw_kwt
ON semantified_content_key_word
USING gin
(to_tsvector('english'::regconfig, key_word_text) );

-- Index: idx_sckw_sc_id

-- DROP INDEX idx_sckw_sc_id;

CREATE INDEX idx_sckw_sc_id
ON semantified_content_key_word
USING btree
(semantified_content_id );

Following is the data

INSERT INTO semantified_content_key_word (id, semantified_content_id,
key_word_text, content_date, context_id) VALUES (7347, 7347, ',
agreementnumber customer servicecreditdate the guarantor taken exhausted
prior being pursuant avoidance doubt shall remain liable case non
incomplete', '2014-11-21 00:00:00', 111);

INSERT INTO semantified_content_key_word (id, semantified_content_id,
key_word_text, content_date, context_id) VALUES (7356, 7356, ', ;
agreementnumber agreementperiod aircraftmodel commencementdate customer
enginemodel enginequantity enginetype foddeductibleamount llpminimumbuild
servicecreditdate steppedpopularrate takeoffderate termdate turnaroundtime
ion ls initiated manager otherwise) inform whether proposed qualified view
lnltlated confirm satisfies criteria out article instruct programme
accordingly determined meet pursuant paragraph a) treated subject only g)
below b)', '2014-11-21 00:00:00', 111);

INSERT INTO semantified_content_key_word (id, semantified_content_id,
key_word_text, content_date, context_id) VALUES (7441, 7441, ',
activationdate agreementnumber enginemodel enginetype llpminimumbuild
servicecreditdate steppedpopularrate turnaroundtime leap-1a as united
continental customer 1/ neutral qec configuration engines shop maintenance:
each engine ', '2014-11-17 00:00:00', 111);

-------------------------------------------------------------
select sckw.*
FROM semantified_content_key_word sckw
where TO_TSVECTOR(sckw.key_word_text) @@ TO_TSQUERY('exhausted');

This is the query which i am running.And the keyword "exhausted" is present
only in one of the rows but i am getting all the 3 rows.

How to avoid the rows where the keyword is not present

I am running the following query in mysql i am getting the correct result

SELECT * ,MATCH(key_word_text) AGAINST('"exhausted"') FROM
semantified_content_key_word
WHERE MATCH(key_word_text) AGAINST('"exhausted"')

I want the equivalent query in postgres

Thanks
Prasanna

Browse pgsql-bugs by date

  From Date Subject
Next Message Alvaro Herrera 2014-12-11 17:22:56 Re: regression, deadlock in high frequency single-row UPDATE
Previous Message prasanna 2014-12-11 13:36:48 BUG #12204: Getting wrong results from full text search