Re: Full text search - query plan? PG 8.4.1

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jesper Krogh <jesper(at)krogh(dot)cc>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Full text search - query plan? PG 8.4.1
Date: 2009-10-18 18:20:19
Message-ID: 8636.1255890019@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Jesper Krogh <jesper(at)krogh(dot)cc> writes:
> "commonterm" matches 37K of the 50K documents (majority), but the query
> plan is "odd" in my eyes.

> * Why does it mis-guess the cost of a Seq Scan on textbody so much?

The cost looks about right to me. The cost units are not milliseconds.

> * Why doesn't it use the index in "id" to fetch the 10 records?

You haven't *got* an index on id, according to the \d output.

The only part of your results that looks odd to me is the very high cost
estimate for the bitmapscan:

> -> Bitmap Heap Scan on textbody (cost=267377.23..269147.80
> rows=36771 width=5) (actual time=15.763..30.576 rows=37133 loops=1)
> Recheck Cond: (textbody_body_fts @@
> to_tsquery('commonterm'::text))
> -> Bitmap Index Scan on textbody_tfs_idx
> (cost=0.00..267368.04 rows=36771 width=0) (actual time=15.419..15.419
> rows=37134 loops=1)
> Index Cond: (textbody_body_fts @@
> to_tsquery('commonterm'::text))

When I try this with a 64K-row table having 'commonterm' in half of the
rows, what I get is estimates of 1530 cost units for the seqscan and
1405 for the bitmapscan (so it prefers the latter). It will switch over
to using an index on id if I add one, but that's not the point at the
moment. There's something strange about your tsvector index. Maybe
it's really huge because the documents are huge?

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jesper Krogh 2009-10-18 18:55:49 Re: Full text search - query plan? PG 8.4.1
Previous Message Jeff Janes 2009-10-18 17:59:04 Re: Calculation of unused columns