Re: Queryplan within FTS/GIN index -search.

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Jesper Krogh <jesper(at)krogh(dot)cc>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Queryplan within FTS/GIN index -search.
Date: 2009-10-23 05:39:18
Message-ID: 1256276358.2580.794.camel@jdavis
Lists: pgsql-performance

On Fri, 2009-10-23 at 07:18 +0200, Jesper Krogh wrote:
> It is indeed the information on individual terms from the statistics
> that enables this.

My mistake, I didn't know it was that smart about it.

> > In effect, what you want are words that aren't searched (or stored) in
> > the index, but are included in the tsvector (so the RECHECK still
> > works). That sounds like it would solve your problem and it would reduce
> > index size, improve update performance, etc. I don't know how difficult
> > it would be to implement, but it sounds reasonable to me.

> That sounds like it could require an index rebuild if the distribution
> changes?

My thought was that the common words could be declared to be common the
same way stop words are. As long as words are only added to this list,
it should be OK.
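
To make the idea concrete, here is a minimal sketch in Python. It is a
toy in-memory model, not anything resembling GIN internals: ToyIndex,
common_words and declare_common are hypothetical names made up for the
illustration. Common words are left out of the index but kept with the
document, so the recheck still sees them.

```python
# Toy model only: ToyIndex, common_words and declare_common are hypothetical
# names and do not correspond to anything in PostgreSQL's GIN implementation.
from collections import defaultdict


class ToyIndex:
    def __init__(self, common_words):
        # Words declared common are never stored in the index.
        self.common_words = set(common_words)
        self.postings = defaultdict(set)  # word -> ids of documents containing it
        self.docs = {}                    # doc id -> full word set (stands in for the tsvector)

    def add(self, doc_id, text):
        words = set(text.lower().split())
        self.docs[doc_id] = words         # common words are kept here for the recheck
        for w in words - self.common_words:
            self.postings[w].add(doc_id)

    def declare_common(self, word):
        # Adding is safe: existing postings for the word are simply ignored from now on.
        self.common_words.add(word.lower())

    def search(self, query):
        """Return the ids of documents that contain every word in the query."""
        terms = set(query.lower().split())
        indexed = terms - self.common_words
        if indexed:
            # Use the index only for the non-common terms.
            candidates = set.intersection(
                *(self.postings.get(t, set()) for t in indexed)
            )
        else:
            # Only common terms were given: the index cannot narrow anything down.
            candidates = set(self.docs)
        # Recheck against the full word set, which still includes the common words.
        return {d for d in candidates if terms <= self.docs[d]}
```

In this model, removing a word from the list would be the unsafe
direction: documents added while the word was common have no postings
for it, so an index lookup on that word would miss them.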

> That would be another plan to pursue, but the MCV is already there.

The problem with MCVs is that the index search could never eliminate a
document for lacking a match: the document might still contain a match
on a word that was previously an MCV but no longer is.

Also, MCVs are relatively few -- you only get about 1000. There might
be a lot of common words you'd like to track.

Perhaps ANALYZE can automatically add the common words above some
frequency threshold to the list?
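
A rough sketch of what such a step might compute, again as a toy in
Python. The function names and the threshold parameter are assumptions
for illustration only; the real ANALYZE works from a sample of rows and
exposes nothing like this.

```python
# Hypothetical helpers: nothing here mirrors what ANALYZE actually does.
from collections import Counter


def common_words_above(docs, threshold):
    """Return the words that occur in more than `threshold` (a fraction) of docs."""
    doc_freq = Counter()
    for text in docs:
        # Count each word at most once per document.
        doc_freq.update(set(text.lower().split()))
    n = len(docs)
    if n == 0:
        return set()
    return {w for w, count in doc_freq.items() if count / n > threshold}


def refresh_common_words(current, docs, threshold):
    # Only ever grow the list: words already declared common stay common,
    # even if their frequency has since dropped below the threshold.
    return set(current) | common_words_above(docs, threshold)
```

Taking the union with the current list is what keeps the list
append-only, which is the condition mentioned above for this to be OK.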

Regards,
Jeff Davis
