Skip site navigation (1) Skip section navigation (2)

Re: tsearch2 poor performance

From: Kris Kiger <kris(at)musicrebellion(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: tsearch2 poor performance
Date: 2004-09-27 18:56:36
Message-ID: 41586264.5020302@musicrebellion.com (view raw or flat)
Thread:
Lists: pgsql-adminpgsql-hackers
Yes, it is much better than no index of sequential scan.  We may just be 
looking at the best performance tsearch2 can offer on my machine.  

search_test=#  explain analyze SELECT count(q) FROM product, 
to_tsquery('oil') AS q  WHERE vector @@ q;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=67847264.50..67847264.50 rows=1 width=32) (actual 
time=83311.552..83311.555 rows=1 loops=1)
   ->  Nested Loop  (cost=12.50..67839764.50 rows=3000001 width=32) 
(actual time=0.204..81960.198 rows=226357 loops=1)
         Join Filter: ("outer".vector @@ "inner".q)
         ->  Seq Scan on product  (cost=0.00..339752.00 rows=3000000 
width=32) (actual time=0.100..27415.795 rows=3000000 loops=1)
         ->  Materialize  (cost=12.50..22.50 rows=1000 width=32) (actual 
time=0.003..0.006 rows=1 loops=3000000)
               ->  Function Scan on q  (cost=0.00..12.50 rows=1000 
width=32) (actual time=0.020..0.024 rows=1 loops=1)
 Total runtime: 83311.735 ms
(7 rows)

search_test=# explain analyze select count(*) from product where 
description like '% oil %';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=347264.01..347264.01 rows=1 width=0) (actual 
time=39858.350..39858.353 rows=1 loops=1)
   ->  Seq Scan on product  (cost=0.00..347252.00 rows=4801 width=0) 
(actual time=0.100..38320.293 rows=226357 loops=1)
         Filter: (description ~~ '% oil %'::text)
 Total runtime: 39858.491 ms


>>Oleg,
>>
>>    Thanks for the help on this.
>>
>>    The query I used to return the 508 number is:
>>         SELECT * FROM stat('SELECT vector FROM product') ORDER BY ndoc
>>desc, word ;
>>
>>    Testing says, the more words I use, the faster the query is.  My
>>original search word, 'oil', appears in 226,357 documents 233,266 times.
>> As far as distinct words go, 'oil' is middle of the road for
>>occurences.  As it is set up now, the best search time I am getting on
>>this single word is roughly 22 seconds.
>>    
>>
>
>Does this time (22 seconds) is still better than seq. scan (no index)
>or standard 'LIKE' ?
>
>  
>



In response to

Responses

pgsql-hackers by date

Next:From: Oleg BartunovDate: 2004-09-27 20:35:54
Subject: Re: tsearch2 poor performance
Previous:From: Joshua D. DrakeDate: 2004-09-27 18:46:43
Subject: Re: tsearch2 poor performance

pgsql-admin by date

Next:From: Anjan DaveDate: 2004-09-27 19:47:42
Subject: moving pg_xlog
Previous:From: Joshua D. DrakeDate: 2004-09-27 18:46:43
Subject: Re: tsearch2 poor performance

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group