Re: Fixing GIN for empty/null/full-scan cases

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Fixing GIN for empty/null/full-scan cases
Date: 2011-01-04 23:18:57
Message-ID: AANLkTimdUq9iaDem_PnvQbe=CR2aPF2DpZxp9H+x4MND@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 4, 2011 at 4:49 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Jan 4, 2011 at 4:09 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> * Existing GIN indexes are upwards compatible so far as on-disk storage
>>> goes, but they will of course be missing entries for empty, null, or
>>> null-containing items.  Users who want to do searches that should find
>>> such items will need to reindex after updating to 9.1.
>
>> This is the only part of this proposal that bothers me a little bit.
>> It would be nice if the system could determine whether a GIN index is
>> "upgraded from 9.0 or earlier and thus doesn't contain these entries"
>> - and avoid trying to use the index for these sorts of queries in
>> cases where it might return wrong answers.
>
> I don't think it's really worth the trouble.  The GIN code has been
> broken for these types of queries since day one, and yet we've had only
> maybe half a dozen complaints about it.  Moreover there's no practical
> way to "avoid trying to use the index", since in many cases the fact
> that a query requires a full-index scan isn't determinable at plan time.
>
> The best we could really do is throw an error at indexscan start, and
> that doesn't seem all that helpful.  But it probably wouldn't take much
> code either, if you're satisfied with that answer.  (I'm envisioning
> adding a version ID to the GIN metapage and then checking that before
> proceeding with a full-index scan.)

I'd be satisfied with that answer. It at least makes it much clearer
when you've got a problem. If this were a more common scenario, I'd
probably advocate for a better solution, but the one you propose seems
adequate given the frequency of the problem as you describe it.
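
To make the idea concrete, here is a rough, standalone sketch of the kind
of check described above: a version number stored in the metapage and
tested at scan start, only when the scan turns out to need a full-index
pass. Every name here (SketchGinMeta, SKETCH_GIN_VERSION_WITH_NULLS,
sketch_begin_scan) is made up for illustration and is not actual GIN
code; the assumption is simply that indexes built before the change read
back as version 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the GIN metapage. Only the field relevant to
 * this sketch is shown; the name and layout are assumptions.
 */
typedef struct SketchGinMeta
{
    uint32_t version;   /* assumed 0 for indexes built before the change */
} SketchGinMeta;

/* Assumed first version whose indexes contain empty/null entries. */
#define SKETCH_GIN_VERSION_WITH_NULLS 1

/*
 * Called at index scan start. Returns NULL if the scan may proceed, or an
 * error message if the scan needs a full-index pass that an old-format
 * index cannot answer correctly.
 */
const char *
sketch_begin_scan(const SketchGinMeta *meta, bool needs_full_scan)
{
    /* Ordinary key searches are unaffected by the missing entries. */
    if (!needs_full_scan)
        return NULL;

    /* Old index: empty/null items were never indexed, so refuse. */
    if (meta->version < SKETCH_GIN_VERSION_WITH_NULLS)
        return "old GIN index cannot do a full-index scan; REINDEX required";

    return NULL;
}
```

The check sits at scan start rather than in the planner because, as noted
above, whether a query needs a full-index scan often isn't known at plan
time.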

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
