Skip site navigation (1) Skip section navigation (2)

Re: Full text search - query plan? PG 8.4.1

From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Full text search - query plan? PG 8.4.1
Date: 2009-10-18 18:55:49
Message-ID: 4ADB64B5.8040900@krogh.cc (view raw or flat)
Thread:
Lists: pgsql-performance
Tom Lane wrote:
> Jesper Krogh <jesper(at)krogh(dot)cc> writes:
>> "commonterm" matches 37K of the 50K documents (majority), but the query
>> plan is "odd" in my eyes.
> 
>> * Why does it mis-guess the cost of a Seq Scan on textbody so much?
> 
> The cost looks about right to me.  The cost units are not milliseconds.
> 
>> * Why doesn't it use the index in "id" to fetch the 10 records?
> 
> You haven't *got* an index on id, according to the \d output.

Thanks (/me bangs my head against the table). I somehow assumed that "id
SERIAL" automatically created it for me. Even enough to not looking for
it to confirm.

> The only part of your results that looks odd to me is the very high cost
> estimate for the bitmapscan:
> 
>>          ->  Bitmap Heap Scan on textbody  (cost=267377.23..269147.80
>> rows=36771 width=5) (actual time=15.763..30.576 rows=37133 loops=1)
>>                Recheck Cond: (textbody_body_fts @@
>> to_tsquery('commonterm'::text))
>>                ->  Bitmap Index Scan on textbody_tfs_idx
>> (cost=0.00..267368.04 rows=36771 width=0) (actual time=15.419..15.419
>> rows=37134 loops=1)
>>                      Index Cond: (textbody_body_fts @@
>> to_tsquery('commonterm'::text))
> 
> When I try this with a 64K-row table having 'commonterm' in half of the
> rows, what I get is estimates of 1530 cost units for the seqscan and
> 1405 for the bitmapscan (so it prefers the latter).  It will switch over
> to using an index on id if I add one, but that's not the point at the
> moment.  There's something strange about your tsvector index.  Maybe
> it's really huge because the documents are huge?

huge is a relative term, but length(ts_vector(body)) is about 200 for
each document. Is that huge? I can postprocess them a bit to get it down
and will eventually do that before going to "production".

Thanks alot.

-- 
Jesper

In response to

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2009-10-18 19:48:18
Subject: Re: Full text search - query plan? PG 8.4.1
Previous:From: Tom LaneDate: 2009-10-18 18:20:19
Subject: Re: Full text search - query plan? PG 8.4.1

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group