GIN versus zero-key queries

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: GIN versus zero-key queries
Date: 2009-03-25 17:25:01
Message-ID: 4076.1238001901@sss.pgh.pa.us
Lists: pgsql-hackers

Our fine manual sayeth (in section 52.5)

When extractQuery returns zero keys, GIN will emit an error. Depending
on the operator, a void query might match all, some, or none of the
indexed values (for example, every array contains the empty array, but
does not overlap the empty array), and GIN cannot determine the correct
answer, nor produce a full-index-scan result if it could determine that
that was correct.

However, the behavior actually implemented by newScanKey() doesn't seem
to agree with this. If there are multiple scankeys (i.e., multiple
indexable clauses) then what actually happens is you get an error
report only if *all* the clauses are zero-key queries. If some clauses
are zero-key and some are normal, it effectively ignores the zero-key
ones and sails ahead with the normal ones. This amounts to assuming
that the zero-key queries match all possible indexed values. But as
noted by the manual, that's not a correct conclusion for some operator
semantics.
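
To make the discrepancy concrete, here is a minimal Python model of the behavior described above. It is only a sketch and not the actual newScanKey() code: the function names, the ZeroKeyError exception, and the use of an empty list to stand in for a zero-key query are all assumptions made for illustration. It compares "drop the zero-key clauses" against evaluating every clause exactly, using array containment and overlap as the two operators.

```python
class ZeroKeyError(Exception):
    """Raised when no clause supplies any key to search the index with."""


def contains(indexed, query):
    # Array containment: every element of query appears in indexed.
    return set(query) <= set(indexed)


def overlaps(indexed, query):
    # Array overlap: at least one element is shared.
    return bool(set(indexed) & set(query))


def scan_as_described(rows, clauses):
    """Model of the behavior described above, not the real newScanKey().

    clauses is a list of (operator, query) pairs; a clause whose query is
    empty stands in for a zero-key query.
    """
    usable = [(op, q) for op, q in clauses if len(q) > 0]
    if not usable:
        # Only reached when *all* clauses are zero-key queries.
        raise ZeroKeyError("all clauses are zero-key queries")
    # Zero-key clauses were dropped, i.e. treated as matching everything.
    return [r for r in rows if all(op(r, q) for op, q in usable)]


def scan_exact(rows, clauses):
    # Reference answer: evaluate every clause, zero-key or not.
    return [r for r in rows if all(op(r, q) for op, q in clauses)]


rows = [[1, 2], [2, 3], [4]]
```

With a zero-key containment clause, both functions agree, because every array contains the empty array. With a zero-key overlap clause next to a normal clause, scan_as_described still returns the rows matching the normal clause, while scan_exact returns nothing, because no array overlaps the empty array.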

I am not sure whether the statement in 52.5 is still accurate, though.
We have an API definition by which extractQuery can distinguish "all
match" from "no match". If we just legislate that "some match" isn't
a valid behavior for zero-key queries, then the code is correct and the
documentation is wrong. However, if the above quote is correct, then
I think newScanKey() is buggy.
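
For discussion, the "legislate it" option could be sketched as follows. This is a hypothetical model, not the existing API: ZeroKeyMeaning, plan_scan, and the tuple return values are invented names, and the only assumption taken from above is that a zero-key query must declare itself as either "all match" or "no match".

```python
from enum import Enum


class ZeroKeyMeaning(Enum):
    # Hypothetical names; the real extractQuery signalling is not shown here.
    MATCH_ALL = "all"
    MATCH_NONE = "none"


def plan_scan(clauses):
    """Combine clauses when "some match" is not allowed for zero-key queries.

    clauses: list whose items are either a list of keys (a normal clause)
    or a ZeroKeyMeaning (a zero-key clause).

    Returns ("empty", []) or ("scan", keys_per_clause), or raises ValueError
    when only match-all clauses remain (a full-index scan would be needed).
    """
    usable = []
    for clause in clauses:
        if clause is ZeroKeyMeaning.MATCH_NONE:
            # One clause that matches nothing makes the whole AND empty.
            return ("empty", [])
        if clause is ZeroKeyMeaning.MATCH_ALL:
            # Safe to ignore: it does not restrict the result.
            continue
        usable.append(clause)
    if not usable:
        raise ValueError("only match-all zero-key clauses: full-index scan needed")
    return ("scan", usable)
```

Under that rule, ignoring a zero-key clause is correct exactly when it is of the "all match" kind, which is the sense in which the code would be right and the documentation wrong.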

Comments?

regards, tom lane
