Skip site navigation (1) Skip section navigation (2)

Re: Queryplan within FTS/GIN index -search.

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Jesper Krogh <jesper(at)krogh(dot)cc>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Queryplan within FTS/GIN index -search.
Date: 2009-10-22 22:56:56
Message-ID: 1256252216.31947.289.camel@monkey-cat.sm.truviso.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Thu, 2009-10-22 at 18:28 +0200, Jesper Krogh wrote:
> I somehow would expect the index-search to take advantage of the MCV's
> informations in the statistics that sort of translate it into a search
> and post-filtering (as PG's queryplanner usually does at the SQL-level).

MCVs are full values that are found in columns or indexes -- you aren't
likely to have two entire documents that are exactly equal, so MCVs are
useless in your example.

I believe that stop words are a more common way of accomplishing what
you want to do, but they are slightly more limited: they won't be
checked at any level, and so are best used for truly common words like
"and". From your example, I assume that you still want the word checked,
but it's not selective enough to be usefully checked by the index.

In effect, what you want are words that aren't searched (or stored) in
the index, but are included in the tsvector (so the RECHECK still
works). That sounds like it would solve your problem and it would reduce
index size, improve update performance, etc. I don't know how difficult
it would be to implement, but it sounds reasonable to me.

The only disadvantage is that it's more metadata to manage -- all of the
existing data like dictionaries and stop words, plus this new "common
words". Also, it would mean that almost every match requires RECHECK. It
would be interesting to know how common a word needs to be before it's
better to leave it out of the index.

Regards,
	Jeff Davis


In response to

Responses

pgsql-performance by date

Next:From: Jesper KroghDate: 2009-10-23 05:18:32
Subject: Re: Queryplan within FTS/GIN index -search.
Previous:From: Scott CareyDate: 2009-10-22 22:08:00
Subject: Re: Partitioned Tables and ORDER BY

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group