Skip site navigation (1) Skip section navigation (2)

Re: tsearch2 poor performance

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Kris Kiger <kris(at)musicrebellion(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: tsearch2 poor performance
Date: 2004-09-27 20:35:54
Message-ID: Pine.GSO.4.58.0409280030230.14980@ra.sai.msu.su (view raw or flat)
Thread:
Lists: pgsql-adminpgsql-hackers
On Mon, 27 Sep 2004, Kris Kiger wrote:

> Yes, it is much better than no index of sequential scan.  We may just be
> looking at the best performance tsearch2 can offer on my machine.

Hmm,

tsearch2 with no index should be faster than LIKE, because tsearch2
already has *parsed* and *sorted* list.

It's interesting to fetch just 226,357 documents from disk and see processing
time.

select count(*) from product limit 226357 offset 1;


>
> search_test=#  explain analyze SELECT count(q) FROM product,
> to_tsquery('oil') AS q  WHERE vector @@ q;
>                                                             QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=67847264.50..67847264.50 rows=1 width=32) (actual
> time=83311.552..83311.555 rows=1 loops=1)
>    ->  Nested Loop  (cost=12.50..67839764.50 rows=3000001 width=32)
> (actual time=0.204..81960.198 rows=226357 loops=1)
>          Join Filter: ("outer".vector @@ "inner".q)
>          ->  Seq Scan on product  (cost=0.00..339752.00 rows=3000000
> width=32) (actual time=0.100..27415.795 rows=3000000 loops=1)
>          ->  Materialize  (cost=12.50..22.50 rows=1000 width=32) (actual
> time=0.003..0.006 rows=1 loops=3000000)
>                ->  Function Scan on q  (cost=0.00..12.50 rows=1000
> width=32) (actual time=0.020..0.024 rows=1 loops=1)
>  Total runtime: 83311.735 ms
> (7 rows)
>
> search_test=# explain analyze select count(*) from product where
> description like '% oil %';
>                                                        QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=347264.01..347264.01 rows=1 width=0) (actual
> time=39858.350..39858.353 rows=1 loops=1)
>    ->  Seq Scan on product  (cost=0.00..347252.00 rows=4801 width=0)
> (actual time=0.100..38320.293 rows=226357 loops=1)
>          Filter: (description ~~ '% oil %'::text)
>  Total runtime: 39858.491 ms
>
>
> >>Oleg,
> >>
> >>    Thanks for the help on this.
> >>
> >>    The query I used to return the 508 number is:
> >>         SELECT * FROM stat('SELECT vector FROM product') ORDER BY ndoc
> >>desc, word ;
> >>
> >>    Testing says, the more words I use, the faster the query is.  My
> >>original search word, 'oil', appears in 226,357 documents 233,266 times.
> >> As far as distinct words go, 'oil' is middle of the road for
> >>occurences.  As it is set up now, the best search time I am getting on
> >>this single word is roughly 22 seconds.
> >>
> >>
> >
> >Does this time (22 seconds) is still better than seq. scan (no index)
> >or standard 'LIKE' ?
> >
> >
> >
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

In response to

pgsql-hackers by date

Next:From: Serguei A. MokhovDate: 2004-09-27 21:01:18
Subject: Compiling from CVS: undefined reference to `PQparameterStatus'
Previous:From: Kris KigerDate: 2004-09-27 18:56:36
Subject: Re: tsearch2 poor performance

pgsql-admin by date

Next:From: Qing ZhaoDate: 2004-09-27 23:03:20
Subject: killing a hung postgres process brings down the Postgres database server on MAC OS X!
Previous:From: Anjan DaveDate: 2004-09-27 19:47:42
Subject: moving pg_xlog

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group