Re: [GENERAL] Incorrect FTS result with GIN index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [GENERAL] Incorrect FTS result with GIN index
Date: 2010-07-28 23:33:01
Message-ID: 26432.1280359981@sss.pgh.pa.us
Lists: pgsql-general, pgsql-hackers

Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
> you can download dump http://mira.sai.msu.su/~megera/tmp/search_tab.dump

Hmm ... I'm not sure why you're failing to reproduce it, because it's
falling over pretty easily for me.  After poking at it for a while,
I am of the opinion that scanGetItem's handling of multiple keys is
fundamentally broken and needs to be rewritten completely.  The
particular case I'm seeing here is that one key returns this sequence of
TIDs/lossy flags:

...
1085/4 0
1086/65535 1
1087/4 0
...

while the other one returns this:

...
1083/11 0
1086/6 0
1086/10 0
1087/10 0
...

and what comes out of scanGetItem is just

...
1086/6 1
...

because after returning that, on the next call it advances both input
keystreams.  So 1086/10 should be visited and is not.

I think that depending on the previous entryRes state to determine what
to do is basically unworkable, and what should probably be done instead
is to remember the last-returned TID and advance keystreams with TIDs <=
that.  I haven't quite thought through how that should interact with
lossy-page TIDs but it seems more robust than what we've got.
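
A rough sketch of that idea in Python, not the actual C code in
scanGetItem. The names AndScan, Item and run_and_scan are made up for
illustration. It assumes each keystream yields (block, offset, lossy)
entries in TID order, that a lossy entry carries offset 65535 as in the
traces above, and that a lossy page pointer matches any offset on its
block. That last rule is only an assumption, since the interaction with
lossy-page TIDs is the part that is still open.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

LOSSY_OFFSET = 0xFFFF  # 65535, the offset the traces show for a lossy page


class Item(NamedTuple):
    block: int
    offset: int
    lossy: bool


class AndScan:
    """Hypothetical AND-merge over keystreams sorted by (block, offset).

    Assumes at least one keystream is supplied.
    """

    def __init__(self, streams: Sequence[Sequence[Item]]) -> None:
        self.streams = [list(s) for s in streams]
        self.pos = [0] * len(self.streams)
        self.last: Optional[Tuple[int, int]] = None  # last-returned TID

    def next_item(self) -> Optional[Item]:
        # Step 1: advance only the keystreams whose current TID is <= the
        # last-returned TID. No per-entry state from the previous call is used.
        if self.last is not None:
            for k, s in enumerate(self.streams):
                while (self.pos[k] < len(s)
                       and (s[self.pos[k]].block, s[self.pos[k]].offset) <= self.last):
                    self.pos[k] += 1

        # Step 2: move lagging keystreams forward until all of them agree.
        while True:
            if any(self.pos[k] >= len(s) for k, s in enumerate(self.streams)):
                return None  # one keystream is exhausted, so the AND is done
            cur = [s[self.pos[k]] for k, s in enumerate(self.streams)]
            top_block = max(c.block for c in cur)
            behind = [k for k, c in enumerate(cur) if c.block < top_block]
            if not behind:
                exact = [c.offset for c in cur if not c.lossy]
                if not exact:
                    # Every keystream is lossy here: return the whole page.
                    result = Item(top_block, LOSSY_OFFSET, True)
                    break
                top_off = max(exact)
                # Assumption: lossy entries match any offset on the block,
                # so only exact entries can be "behind" within a block.
                behind = [k for k, c in enumerate(cur)
                          if not c.lossy and c.offset < top_off]
                if not behind:
                    result = Item(top_block, top_off, any(c.lossy for c in cur))
                    break
            for k in behind:
                self.pos[k] += 1

        self.last = (result.block, result.offset)
        return result


def run_and_scan(streams: Sequence[Sequence[Item]]) -> List[Item]:
    """Call next_item() until the scan is exhausted and collect the results."""
    scan = AndScan(streams)
    out: List[Item] = []
    item = scan.next_item()
    while item is not None:
        out.append(item)
        item = scan.next_item()
    return out


# The two keystream fragments quoted above.
key_a = [Item(1085, 4, False), Item(1086, LOSSY_OFFSET, True), Item(1087, 4, False)]
key_b = [Item(1083, 11, False), Item(1086, 6, False),
         Item(1086, 10, False), Item(1087, 10, False)]
results = run_and_scan([key_a, key_b])
```

On those two fragments the sketch returns 1086/6 and then 1086/10, both
flagged lossy, because after returning 1086/6 only the second keystream
is advanced and the lossy 1086/65535 entry stays current.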

I'm also noticing that the ANDing behavior for the "ee:* & dd:*" query
style seems very much stupider than it needs to be --- it's returning
lossy pages that very obviously don't need to be examined because the
other keystream has no match at all on that page.  But I haven't had
time to probe into the reason why.
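
To spell out what I'd expect instead, here is a small Python sketch of
the page-level rule for an AND query. It is only a statement of the
expected result, not a diagnosis of the current code. The function name
and the sample entries are hypothetical (block 1090 is invented for the
example), and keystream entries are assumed to be (block, offset, lossy)
tuples.

```python
from typing import List, Tuple

Tid = Tuple[int, int, bool]  # (block, offset, lossy)


def lossy_pages_worth_visiting(streams: List[List[Tid]]) -> List[int]:
    """Blocks that some keystream reports as lossy AND on which every
    keystream has at least one entry (exact or lossy)."""
    blocks_per_stream = [{block for block, _, _ in s} for s in streams]
    common = set.intersection(*blocks_per_stream)
    lossy = {block for s in streams for block, _, is_lossy in s if is_lossy}
    return sorted(lossy & common)


# Hypothetical keystreams for the two prefix terms of an AND query.
ee = [(1086, 65535, True), (1090, 65535, True)]
dd = [(1086, 6, False), (1087, 10, False)]
pages = lossy_pages_worth_visiting([ee, dd])
```

Here only block 1086 needs to be examined: the other keystream has
nothing on block 1090, so that lossy page cannot satisfy the AND.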

I'm out of time for today, do you want to work on it?

			regards, tom lane
