Re: Tsearch2 really slower than ilike ?

From: Hervé Piedvache <herve(at)elma(dot)fr>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tsearch2 really slower than ilike ?
Date: 2004-11-17 17:16:10
Message-ID: 200411171816.10794.herve@elma.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Oleg,

Sorry but when I do your request I get :
# select id_site from site where idx_site_name @@  'livejourn';
ERROR:  type " " does not exist

What is this ?

(private: I don't know what happend with my mail, but I do nothing special to
disturb the contains when I'm writting to you ...)

Le Mardi 16 Novembre 2004 22:13, Oleg Bartunov a écrit :
> ok, I downloaded dump of table and here is what I found:
>
> zz=# select count(*) from tt;
> count
> --------
> 183956
> (1 row)
>
> zz=# select * from stat('select tt from tt') order by ndoc desc, nentry
> desc,wo
> rd limit 10;
> word | ndoc | nentry
> --------------+-------+--------
> blog | 12710 | 12835
> weblog | 4857 | 4859
> news | 4402 | 4594
> life | 4136 | 4160
> world | 1980 | 1986
> journal | 1882 | 1883
> livejourn | 1737 | 1737
> thought | 1669 | 1677
> web | 1154 | 1161
> scotsman.com | 1138 | 1138
> (10 rows)
>
> zz=# explain analyze select tt from tt where tt @@ 'blog';
> QUERY PLAN
> ---------------------------------------------------------------------------
>------------------------------------------- Index Scan using tt_idx on tt
> (cost=0.00..728.83 rows=184 width=32) (actual time=0.047..141.110
> rows=12710 loops=1) Index Cond: (tt @@ '\'blog\''::tsquery)
> Filter: (tt @@ '\'blog\''::tsquery)
> Total runtime: 154.105 ms
> (4 rows)
>
> It's really fast ! So, I don't understand your problem.
> I run query on my desktop machine, nothing special.
>
>
> Oleg
>
> On Tue, 16 Nov 2004, [iso-8859-15] Herv? Piedvache wrote:
> > Hi,
> >
> > I'm completly dispointed with Tsearch2 ...
> >
> > I have a table like this :
> > Table "public.site"
> > Column | Type |
> > Modifiers
> > ---------------+-----------------------------+---------------------------
> >------------------------------------ id_site | integer
> > | not null default
> > nextval('public.site_id_site_seq'::text)
> > site_name | text |
> > site_url | text |
> > url | text |
> > language | text |
> > datecrea | date | default now()
> > id_category | integer |
> > time_refresh | integer |
> > active | integer |
> > error | integer |
> > description | text |
> > version | text |
> > idx_site_name | tsvector |
> > lastcheck | date |
> > lastupdate | timestamp without time zone |
> > Indexes:
> > "site_id_site_key" unique, btree (id_site)
> > "ix_idx_site_name" gist (idx_site_name)
> > Triggers:
> > tsvectorupdate_site_name BEFORE INSERT OR UPDATE ON site FOR EACH ROW
> > EXECUTE PROCEDURE tsearch2('idx_site_name', 'site_name')
> >
> > I have 183 956 records in the database ...
> >
> > SELECT s.site_name, s.id_site, s.description, s.site_url,
> > case when exists (select id_user
> > from user_choice u
> > where u.id_site=s.id_site
> > and u.id_user = 1)
> > then 1 else 0 end as bookmarked
> > FROM site s
> > WHERE s.idx_site_name @@ to_tsquery('atari');
> >
> > Explain Analyze :
> > QUERY PLAN
> > -------------------------------------------------------------------------
> >----------------------------------------------------------------- Index
> > Scan using ix_idx_site_name on site s (cost=0.00..1202.12 rows=184
> > width=158) (actual time=4687.674..4698.422 rows=1 loops=1)
> > Index Cond: (idx_site_name @@ '\'atari\''::tsquery)
> > Filter: (idx_site_name @@ '\'atari\''::tsquery)
> > SubPlan
> > -> Seq Scan on user_choice u (cost=0.00..3.46 rows=1 width=4)
> > (actual time=0.232..0.232 rows=0 loops=1)
> > Filter: ((id_site = $0) AND (id_user = 1))
> > Total runtime: 4698.608 ms
> >
> > First time I run the request I have a result in about 28 seconds.
> >
> > SELECT s.site_name, s.id_site, s.description, s.site_url,
> > case when exists (select id_user
> > from user_choice u
> > where u.id_site=s.id_site
> > and u.id_user = 1)
> > then 1 else 0 end as bookmarked
> > FROM site_rss s
> > WHERE s.site_name ilike '%atari%'
> >
> > QUERY PLAN
> > -------------------------------------------------------------------------
> >--------------------------------------- Seq Scan on site_rss s
> > (cost=0.00..11863.16 rows=295 width=158) (actual time=17.414..791.937
> > rows=12 loops=1)
> > Filter: (site_name ~~* '%atari%'::text)
> > SubPlan
> > -> Seq Scan on user_choice u (cost=0.00..3.46 rows=1 width=4)
> > (actual time=0.222..0.222 rows=0 loops=12)
> > Filter: ((id_site = $0) AND (id_user = 1))
> > Total runtime: 792.099 ms
> >
> > First time I run the request I have a result in about 789 miliseconds
> > !!???
> >
> > I'm using PostgreSQL v7.4.6 with a Bi-Penitum III 933 Mhz and 1 Gb of
> > RAM.
> >
> > Any idea ... ? For the moment I'm going back to use the ilike solution
> > ... but I was really thinking that Tsearch2 could be a better solution
> > ...
> >
> > Regards,
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
> joining column's datatypes do not match

--
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Oleg Bartunov 2004-11-17 17:23:35 Re: Tsearch2 really slower than ilike ?
Previous Message Darcy Buskermolen 2004-11-17 17:13:09 Re: memcached and PostgreSQL