
Re: Queryplan within FTS/GIN index -search.

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Jesper Krogh <jesper(at)krogh(dot)cc>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Queryplan within FTS/GIN index -search.
Date:	2009-10-22 22:56:56
Message-ID:	1256252216.31947.289.camel@monkey-cat.sm.truviso.com
Lists:	pgsql-performance

On Thu, 2009-10-22 at 18:28 +0200, Jesper Krogh wrote:
> I somehow would expect the index-search to take advantage of the MCV's
> informations in the statistics that sort of translate it into a search
> and post-filtering (as PG's queryplanner usually does at the SQL-level).

MCVs (most common values) are complete values found in columns or
indexes. You aren't likely to have two entire documents that are exactly
equal, so MCVs don't help in your example.

I believe that stop words are a more common way of accomplishing what
you want to do, but they are slightly more limited: they won't be
checked at any level, and so are best used for truly common words like
"and". From your example, I assume that you still want the word checked,
but it's not selective enough to be usefully checked by the index.

In effect, what you want are words that aren't searched (or stored) in
the index, but are included in the tsvector (so the RECHECK still
works). That sounds like it would solve your problem and it would reduce
index size, improve update performance, etc. I don't know how difficult
it would be to implement, but it sounds reasonable to me.
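
To make the idea concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code, and it does not reflect how GIN or tsvector actually work. The names COMMON_WORDS, build_index and search are hypothetical, as is the choice of "database" as the common word. It only shows the shape of the proposal: common words stay in the document but are left out of the index, and a recheck against the full document still verifies them.

```python
# Toy model only: this is not how GIN or tsvector is implemented.
from collections import defaultdict

# Hypothetical "common words": kept in the document, left out of the index.
COMMON_WORDS = {"database"}


def build_index(docs):
    """Map each non-common word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            if word not in COMMON_WORDS:
                index[word].add(doc_id)
    return index


def search(docs, index, terms):
    """Return sorted ids of documents containing every term (AND query)."""
    terms = [t.lower() for t in terms]
    indexed = [t for t in terms if t not in COMMON_WORDS]
    if indexed:
        # Use the index for the selective terms only.
        candidates = set.intersection(*(index.get(t, set()) for t in indexed))
    else:
        # Nothing selective to look up: every document is a candidate.
        candidates = set(docs)
    # Recheck: verify all terms, including common ones, against the full text.
    return sorted(
        doc_id for doc_id in candidates
        if all(t in docs[doc_id].lower().split() for t in terms)
    )


docs = {
    1: "postgres database planner",
    2: "database index tuning",
    3: "planner statistics",
}
index = build_index(docs)
```

Searching for "database" and "planner" together looks up only "planner" in the index and then rechecks both words against each candidate. Searching for "database" alone has nothing to look up, so every document has to be rechecked, which is the cost side of the trade-off.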

One disadvantage is that it's more metadata to manage -- all of the
existing data like dictionaries and stop words, plus this new "common
words". Another is that almost every match would require RECHECK. It
would be interesting to know how common a word needs to be before it's
better to leave it out of the index.
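
One way to start exploring that question is to measure how many documents each word appears in. The sketch below is a hypothetical Python helper (common_words is not an existing function anywhere), and the 0.5 cutoff is an arbitrary placeholder: the right threshold is exactly the open question, and this only shows how candidates could be listed once one is chosen.

```python
from collections import Counter


def common_words(docs, threshold):
    """Return words appearing in more than `threshold` fraction of documents."""
    doc_freq = Counter()
    for text in docs:
        # Count each word once per document (document frequency).
        doc_freq.update(set(text.lower().split()))
    return {w for w, n in doc_freq.items() if n / len(docs) > threshold}


docs = [
    "postgres database planner",
    "database index tuning",
    "planner statistics database",
    "gin index",
]
# Assumed cutoff: words present in more than half of the documents.
candidates = common_words(docs, threshold=0.5)
```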

Regards,
Jeff Davis

In response to

Queryplan within FTS/GIN index -search. at 2009-10-22 16:28:13 from Jesper Krogh

Responses

Re: Queryplan within FTS/GIN index -search. at 2009-10-23 05:18:32 from Jesper Krogh
