Re: [GENERAL] Fragments in tsearch2 headline

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: sushant354(at)gmail(dot)com
Cc: Pierre-Yves Strub <pierre(dot)yves(dot)strub(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GENERAL] Fragments in tsearch2 headline
Date: 2008-06-05 16:21:26
Message-ID: 48481286.5080103@sigaev.ru
Lists: pgsql-general pgsql-hackers

> A couple of caveats:
>
> 1. ts_headline testing was done with current CVS HEAD, whereas
> headline_with_fragments was done with postgres 8.3.1.
> 2. For headline_with_fragments, TSVector for the document was obtained
> by joining with another table.
> Are these differences understandable?

That is quite possible, because ts_headline uses several criteria to pick the
'best' cover: its length, the number of query words it contains, and whether
there are good words at the beginning and end of the headline. Your fragment
algorithm considers only the total number of words in all covers. That's not
very good, but I think it's acceptable. There are no formal rules for deciding
whether a headline (or a ranking) is good or bad, just people's opinions.

Another possible reason: the original algorithm looks at all covers and tries
to find the best one, while your algorithm looks only for the shortest covers
to fill the headline.
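
To make the difference between the two strategies concrete, here is a minimal
Python sketch. It is not the code of ts_headline or of the patch: the function
names, the scoring tuple, and the word budget are assumptions chosen only for
illustration. A "cover" here is a minimal span of word positions that contains
every query word.

```python
from typing import List, Set, Tuple

Cover = Tuple[int, int]  # inclusive (start, end) word positions


def find_covers(words: List[str], query: Set[str]) -> List[Cover]:
    """Return every minimal span that contains all query words."""
    covers = []
    for start, word in enumerate(words):
        if word not in query:
            continue
        seen = set()
        for end in range(start, len(words)):
            if words[end] in query:
                seen.add(words[end])
            if seen == query:
                covers.append((start, end))
                break
    # Drop spans that contain another span, since they are not minimal.
    return [c for c in covers
            if not any(o != c and c[0] <= o[0] and o[1] <= c[1] for o in covers)]


def best_single_cover(words: List[str], query: Set[str],
                      covers: List[Cover]) -> Cover:
    """Look at all covers and pick one by several criteria (hypothetical)."""
    def score(cover: Cover):
        start, end = cover
        hits = sum(1 for w in words[start:end + 1] if w in query)
        return (hits, -(end - start + 1))  # more query words, then shorter
    return max(covers, key=score)


def fill_with_shortest(covers: List[Cover], max_words: int) -> List[Cover]:
    """Greedily take the shortest non-overlapping covers within a word budget."""
    chosen: List[Cover] = []
    used = 0
    for start, end in sorted(covers, key=lambda c: c[1] - c[0]):
        length = end - start + 1
        overlaps = any(start <= e and s <= end for s, e in chosen)
        if not overlaps and used + length <= max_words:
            chosen.append((start, end))
            used += length
    return sorted(chosen)


words = "a b x x a x b a b".split()
query = {"a", "b"}
covers = find_covers(words, query)
print(covers)
print(best_single_cover(words, query, covers))
print(fill_with_shortest(covers, max_words=5))
```

The first strategy ranks every cover and returns one winner, while the second
only cares about how many short covers fit into the budget, so the two can
easily return different text for the same document and query.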

But it is very desirable to use ShortWord: it is confusing for the user if one
option has a non-obvious side effect on another.
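
For reference, ShortWord is documented as dropping words of that length or
less from the start and end of a headline (default 3). A minimal Python sketch
of that trimming, applied to one fragment, could look like the following. The
function name is hypothetical, and this is a simplification, not the actual
ts_headline code.

```python
from typing import List


def trim_short_words(words: List[str], short_word: int = 3) -> List[str]:
    """Drop words of length <= short_word from both ends of a fragment."""
    start, end = 0, len(words)
    while start < end and len(words[start]) <= short_word:
        start += 1
    while end > start and len(words[end - 1]) <= short_word:
        end -= 1
    return words[start:end]


print(trim_short_words("the quick brown fox is".split()))
```

Applying the same trimming to every fragment would keep the option's behaviour
the same whether or not fragments are requested.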

> If you think these caveats are the reasons or there is something I am
> missing, then I can repeat the entire experiments with exactly the same
> conditions.

The test that interests me is a comparison of hlCover with Cover in your patch,
i.e. develop a patch that uses hlCover instead of Cover and compare the old
patch with the new one.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
