Re: Tsearch2 really slower than ilike ?

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Hervé Piedvache <herve(at)elma(dot)fr>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tsearch2 really slower than ilike ?
Date: 2004-11-16 21:13:08
Message-ID: Pine.GSO.4.61.0411170011200.23911@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

ok, I downloaded dump of table and here is what I found:

zz=# select count(*) from tt;
count
--------
183956
(1 row)

zz=# select * from stat('select tt from tt') order by ndoc desc, nentry
desc,wo
rd limit 10;
word | ndoc | nentry
--------------+-------+--------
blog | 12710 | 12835
weblog | 4857 | 4859
news | 4402 | 4594
life | 4136 | 4160
world | 1980 | 1986
journal | 1882 | 1883
livejourn | 1737 | 1737
thought | 1669 | 1677
web | 1154 | 1161
scotsman.com | 1138 | 1138
(10 rows)

zz=# explain analyze select tt from tt where tt @@ 'blog';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using tt_idx on tt (cost=0.00..728.83 rows=184 width=32) (actual time=0.047..141.110 rows=12710 loops=1)
Index Cond: (tt @@ '\'blog\''::tsquery)
Filter: (tt @@ '\'blog\''::tsquery)
Total runtime: 154.105 ms
(4 rows)

It's really fast ! So, I don't understand your problem.
I run query on my desktop machine, nothing special.

Oleg
On Tue, 16 Nov 2004, [iso-8859-15] Herv? Piedvache wrote:

> Hi,
>
> I'm completly dispointed with Tsearch2 ...
>
> I have a table like this :
> Table "public.site"
> Column | Type |
> Modifiers
> ---------------+-----------------------------+---------------------------------------------------------------
> id_site | integer | not null default
> nextval('public.site_id_site_seq'::text)
> site_name | text |
> site_url | text |
> url | text |
> language | text |
> datecrea | date | default now()
> id_category | integer |
> time_refresh | integer |
> active | integer |
> error | integer |
> description | text |
> version | text |
> idx_site_name | tsvector |
> lastcheck | date |
> lastupdate | timestamp without time zone |
> Indexes:
> "site_id_site_key" unique, btree (id_site)
> "ix_idx_site_name" gist (idx_site_name)
> Triggers:
> tsvectorupdate_site_name BEFORE INSERT OR UPDATE ON site FOR EACH ROW
> EXECUTE PROCEDURE tsearch2('idx_site_name', 'site_name')
>
> I have 183 956 records in the database ...
>
> SELECT s.site_name, s.id_site, s.description, s.site_url,
> case when exists (select id_user
> from user_choice u
> where u.id_site=s.id_site
> and u.id_user = 1) then 1
> else 0 end as bookmarked
> FROM site s
> WHERE s.idx_site_name @@ to_tsquery('atari');
>
> Explain Analyze :
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using ix_idx_site_name on site s (cost=0.00..1202.12 rows=184
> width=158) (actual time=4687.674..4698.422 rows=1 loops=1)
> Index Cond: (idx_site_name @@ '\'atari\''::tsquery)
> Filter: (idx_site_name @@ '\'atari\''::tsquery)
> SubPlan
> -> Seq Scan on user_choice u (cost=0.00..3.46 rows=1 width=4) (actual
> time=0.232..0.232 rows=0 loops=1)
> Filter: ((id_site = $0) AND (id_user = 1))
> Total runtime: 4698.608 ms
>
> First time I run the request I have a result in about 28 seconds.
>
> SELECT s.site_name, s.id_site, s.description, s.site_url,
> case when exists (select id_user
> from user_choice u
> where u.id_site=s.id_site
> and u.id_user = 1) then 1
> else 0 end as bookmarked
> FROM site_rss s
> WHERE s.site_name ilike '%atari%'
>
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
> Seq Scan on site_rss s (cost=0.00..11863.16 rows=295 width=158) (actual
> time=17.414..791.937 rows=12 loops=1)
> Filter: (site_name ~~* '%atari%'::text)
> SubPlan
> -> Seq Scan on user_choice u (cost=0.00..3.46 rows=1 width=4) (actual
> time=0.222..0.222 rows=0 loops=12)
> Filter: ((id_site = $0) AND (id_user = 1))
> Total runtime: 792.099 ms
>
> First time I run the request I have a result in about 789 miliseconds !!???
>
> I'm using PostgreSQL v7.4.6 with a Bi-Penitum III 933 Mhz and 1 Gb of RAM.
>
> Any idea ... ? For the moment I'm going back to use the ilike solution ... but
> I was really thinking that Tsearch2 could be a better solution ...
>
> Regards,
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Simon Riggs 2004-11-16 21:29:56 Re: mis-estimation on data-warehouse aggregate creation
Previous Message Alexandre Leclerc 2004-11-16 21:11:30 nuderstanding 'explain analyse'