Re: tsearch2 headline and postgresql.conf

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: pgsql-performance(at)nullmx(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: tsearch2 headline and postgresql.conf
Date: 2006-01-22 08:24:55
Message-ID: Pine.GSO.4.63.0601221110190.14417@ra.sai.msu.su
Lists: pgsql-performance

You didn't provide us with any query or its EXPLAIN ANALYZE output.
Please post them, just to make sure everything is fine on your side.
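
For illustration, here is a rough sketch of the kind of output that would help: the subquery-then-headline() pattern from the tsearch2 guide, wrapped in EXPLAIN ANALYZE. Only stripped_text comes from the original post. The table name documents, the columns id and idxfti (a tsvector column), the search term, and the LIMIT are all assumptions made for the example, so substitute your own.

```sql
-- Hypothetical schema: documents(id, stripped_text text, idxfti tsvector).
-- Only stripped_text is named in the original post; the rest is assumed.
EXPLAIN ANALYZE
SELECT id,
       headline(stripped_text, q) AS snippet,  -- headline() runs only on the rows kept below
       rank
FROM (
    -- Inner query: find and rank the matches using the tsvector column only.
    SELECT d.id, d.stripped_text, q, rank(d.idxfti, q) AS rank
    FROM documents d, to_tsquery('postgres') AS q
    WHERE d.idxfti @@ q
    ORDER BY rank DESC
    LIMIT 10                                   -- assumed page size
) AS top_hits
ORDER BY rank DESC;
```

Running the inner query on its own under EXPLAIN ANALYZE as well, and comparing the two timings, should show how much of the total is spent in headline() rather than in the search itself.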

Oleg
On Sun, 22 Jan 2006, pgsql-performance(at)nullmx(dot)com wrote:

> Hi folks,
>
> I'm not sure if this is the right place for this but thought I'd ask. I'm
> relatively new to postgres, having only used it on 3 projects, and am just
> delving into the setup and admin for the second time.
>
> I decided to try tsearch2 for this project's search requirements but am
> having trouble attaining adequate performance. I think I've nailed it down
> to trouble with the headline() function in tsearch2.
> In short, there is a crawler that grabs HTML docs and places them in a
> database. The search is done using tsearch2 pretty much installed according
> to instructions. I have read a couple online guides suggested by this list
> for tuning the postgresql.conf file. I only made modest adjustments because
> I'm not working with top-end hardware and am still uncertain of the actual
> impact of the different parameters.
>
> I've been learning 'explain' and over the course of reading I have done
> enough query tweaking to discover the source of my headache seems to be
> headline().
>
> On a query over 429 documents, where the average size of the stripped-down
> document as stored is 21KB and the max is 518KB (an anomaly), tsearch2
> performs exceptionally well, returning most queries in about 100ms.
>
> On the other hand, following the tsearch2 guide which suggests returning that
> first portion as a subquery and then generating the headline() from those
> results, I see the query increase to 4 seconds!
>
> This seems to be directly related to document size. If I filter out that
> 518KB doc along with some 100KB docs by returning "substring( stripped_text
> FROM 0 FOR 50000) AS stripped_text" I decrease the time to 1.4 seconds, but
> increase the risk of not getting a headline.
>
> Seeing as how this problem is directly tied to document size, I'm wondering
> if there are any specific settings in postgresql.conf that may help, or is
> this just a fact of life for the headline() function? Or, does anyone know
> what the problem is and how to overcome it?

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83