Re: Tsearch2 performance on big database

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Rick Jansen <rick(at)rockingstone(dot)nl>
Cc: Mike Rylander <mrylander(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Tsearch2 performance on big database
Date: 2005-03-24 10:51:42
Message-ID: Pine.GSO.4.62.0503241343580.5508@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, 24 Mar 2005, Rick Jansen wrote:

> Oleg Bartunov wrote:
>> from my notes
>> http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes
>>
>> It's usefull to see words statistics, for example, to check how good
>> your dictionaries work or how did you configure pg_ts_cfgmap. Also, you
>> may notice probable stop words relevant for your collection. Tsearch
>> provides stat() function:
>>
>> .......................
>>
>> Don't hesitate to read it and if you find some bugs or know better wording
>> I'd be glad to improve my notes.
>>
>
> Thanks, but that stat() query takes way too long.. I let it run for like
> 4 hours and still nothing. The database I am testing tsearch2 on is also
> the production database (mysql) server so I have to be careful not to
> use too many resources :o

stat() is indeed a bigdog, it was designed for developers needs,
so we recommend to save results in table.

>
> Anyway, here's my pg_ts_cfgmap now (well the relevant bits):
>
> default_english | lhword | {en_ispell,en_stem}
> default_english | lpart_hword | {en_ispell,en_stem}
> default_english | lword | {en_ispell,en_stem}
>
> Is it normal that queries for single words (or perhaps they are words
> that are common) take a really long time? Like this:
>

'hispanic' isn't common, I see you get only 674 rows and
'buckingham & palace' returns 185 rows. Did you run 'vacuum analyze' ?
I see a big discrepancy between estimated rows (8041) and actual rows.

> ilab=# explain analyze select count(*) from books where description_fti @@
> to_tsquery('default', 'hispanic');
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=20369.81..20369.81 rows=1 width=0) (actual
> time=261512.031..261512.031 rows=1 loops=1)
> -> Index Scan using idxfti_idx on books (cost=0.00..20349.70 rows=8041
> width=0) (actual time=45777.760..261509.288 rows=674 loops=1)
> Index Cond: (description_fti @@ '\'hispan\''::tsquery)
> Total runtime: 261518.529 ms
> (4 rows)
>
> ilab=# explain analyze select titel from books where description_fti @@
> to_tsquery('default', 'buckingham & palace');
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
> Index Scan using idxfti_idx on books (cost=0.00..20349.70 rows=8041
> width=57) (actual time=18992.045..48863.385 rows=185 loops=1)
> Index Cond: (description_fti @@ '\'buckingham\' & \'palac\''::tsquery)
> Total runtime: 48863.874 ms
> (3 rows)
>
>
> I dont know what happened, these queries were a lot faster 2 days ago..what
> the feck is going on?!
>
> Rick
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rick Jansen 2005-03-24 10:58:04 Re: Tsearch2 performance on big database
Previous Message Rick Jansen 2005-03-24 10:41:04 Re: Tsearch2 performance on big database