Re: TS: Limited cover density ranking

From: karavelov(at)mail(dot)bg
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TS: Limited cover density ranking
Date: 2012-01-29 01:41:35
Message-ID: 3c8f10c1e2ffb295f4013e020723249f.mailbg@mail.bg
Lists: pgsql-hackers

----- Quote from Oleg Bartunov (oleg(at)sai(dot)msu(dot)su), on 28.01.2012 at 21:04 -----

> I suggest you work on more general approach, see
> http://www.sai.msu.su/~megera/wiki/2009-08-12 for example.
>
> btw, I don't like you changed ts_rank_cd arguments.

Hello Oleg,

Thanks for the feedback.

Is it OK to begin by adding an extra argument and a check in calc_rank_cd?

I could use new function names so that the ts_rank_cd arguments are not
overloaded. My proposal is:

At the SQL level:
ts_rank_lcd([weights], tsvector, tsquery, limit, [method])

At the C level:
ts_ranklcd_wttlf
ts_ranklcd_wttl
ts_ranklcd_ttlf
ts_ranklcd_ttl

The functions could be added as an extension, but they are just
trampolines into calc_rank_cd().

I agree that what you describe in the wiki page is a more general approach. So this:

SELECT ts_rank_lcd(to_tsvector('a b c'), to_tsquery('a&c'),2 )>0;

could be replaced with

SELECT to_tsvector('a b c') @@ to_tsquery('(a ?2 c)|(c ?2 a) ');

but if we need to look for 3 or more nearby terms in any order, the tsquery
with the '?' operator will become quite complicated. For example

SELECT tsvec @@
'(a ? b ? c) | (a ? c ? b) | (b ? a ? c) | (b ? c ? a) | (c ? a ? b) | (c ? b ? a)'::tsquery;

is the same as

SELECT ts_rank_lcd(tsvec, 'a&b&c'::tsquery,2)>0;
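To show how quickly the '?' form grows, here is a small Python sketch (not part of the patch) that builds the alternation for any list of terms. The helper name unordered_proximity_query is hypothetical, and '?' is the operator proposed on the wiki page, not something a stock PostgreSQL accepts. The number of alternatives is n! for n terms.

```python
from itertools import permutations


def unordered_proximity_query(terms, op="?"):
    """Build a tsquery string that matches the terms adjacent in any order.

    `op` is the proximity operator proposed on the wiki page; it is an
    assumption here and is not available in a stock PostgreSQL.
    """
    # One parenthesised alternative per ordering of the terms.
    alternatives = (
        "(" + f" {op} ".join(ordering) + ")"
        for ordering in permutations(terms)
    )
    return " | ".join(alternatives)


if __name__ == "__main__":
    # Three terms give 3! = 6 alternatives; four terms give 24.
    print(unordered_proximity_query(["a", "b", "c"]))
    print(len(list(permutations("abcd"))))
```

With three terms this reproduces the six-way query above; with five terms the query would already need 120 alternatives.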

This is why I think that the general approach does not exclude the
usefulness of the approach that I am proposing.
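For illustration only, the check behind "ts_rank_lcd(...) > 0" can be sketched in Python as a single pass over the term positions. This is not the C code in calc_rank_cd. It assumes that the limit bounds the distance between the first and the last position of a cover, which is what the two examples above suggest. The function name has_limited_cover and the dict-of-positions input are hypothetical.

```python
def has_limited_cover(positions, limit):
    """Return True if every term occurs inside some window of width <= limit.

    positions: dict mapping each query term to its list of word positions.
    limit: assumed to be the maximum allowed (last position - first position).
    """
    # A term that never occurs cannot be covered.
    if not positions or any(not plist for plist in positions.values()):
        return False

    # All occurrences, ordered by position in the document.
    events = sorted(
        (pos, term) for term, plist in positions.items() for pos in plist
    )

    counts = {}  # term -> occurrences inside the current window
    left = 0     # index of the leftmost event still in the window
    for pos, term in events:
        counts[term] = counts.get(term, 0) + 1
        # Drop events from the left until the window is narrow enough.
        while pos - events[left][0] > limit:
            dropped = events[left][1]
            counts[dropped] -= 1
            if counts[dropped] == 0:
                del counts[dropped]
            left += 1
        if len(counts) == len(positions):
            return True
    return False


if __name__ == "__main__":
    # to_tsvector('a b c') puts a at position 1 and c at position 3.
    print(has_limited_cover({"a": [1], "c": [3]}, 2))
    print(has_limited_cover({"a": [1], "c": [3]}, 1))
```

The cost depends on the number of term occurrences, not on the number of possible orderings of the terms.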

Best regards

--
Luben Karavelov
