From: | karavelov(at)mail(dot)bg |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | TS: Limited cover density ranking |
Date: | 2012-01-27 16:06:53 |
Message-ID: | c4bd0b01f3398372af9572f4913bdb6c.mailbg@mail.bg |
Lists: | pgsql-hackers |
Hello,
I have developed a variation of the cover density ranking functions that counts only covers that are smaller than a specified limit. It is useful for finding combinations of terms that appear near one another. Here is an example of usage:
-- normal cover density ranking : not changed
luben=> select ts_rank_cd(to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
ts_rank_cd
------------
0.0333333
(1 row)
-- limited to 2
luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
ts_rank_cd
------------
0
(1 row)
luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
ts_rank_cd
------------
0.1
(1 row)
-- limited to 3
luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
ts_rank_cd
------------
0.0333333
(1 row)
luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
ts_rank_cd
------------
0.133333
(1 row)
Please find attached a patch against the 9.1.2 sources. I preferred to make a patch rather than a separate extension because it is only a one-statement change in the calc_rank_cd function. If I had to make an extension, a lot of code would be duplicated between backend/utils/adt/tsrank.c and the extension.
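To illustrate the idea without reading the patch, here is a minimal toy model in C. It is not the code from the attached patch and not the real calc_rank_cd(). It assumes a two-term query with default weights, where each cover contributes 0.1 / (last - first), and it assumes a cover is skipped when that distance exceeds the limit. Both assumptions are inferred only from the example outputs above. The Cover type and the limited_cover_rank() name are hypothetical, and the covers are worked out by hand rather than computed from a tsvector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: word positions of the first and last query term in one cover. */
typedef struct {
    int first;
    int last;
} Cover;

/*
 * Toy model of limited cover density ranking (assumption, not the patch).
 * Each cover adds 0.1 / (last - first); covers whose distance is larger
 * than the limit are ignored. A limit <= 0 means "no limit".
 */
double
limited_cover_rank(const Cover *covers, size_t ncovers, int limit)
{
    double rank = 0.0;

    assert(covers != NULL || ncovers == 0);

    for (size_t i = 0; i < ncovers; i++) {
        int span = covers[i].last - covers[i].first;

        assert(span > 0);

        /* The only difference from the unlimited case: skip long covers. */
        if (limit > 0 && span > limit)
            continue;

        rank += 0.1 / span;
    }
    return rank;
}
```

For the document 'a b c d e g h i j k a d' and the query 'a&d', the hand-derived covers are {1,4}, {4,11} and {11,12}. With a limit of 2 only the last cover counts, giving 0.1; with a limit of 3 the first and last covers count, giving about 0.1333, which agrees with the outputs shown above.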
I have some questions:
1. Is there interest in developing this further (documentation, cleanup, etc.) for inclusion in one of the next versions? If so, there are some further questions:
- should I overload ts_rank_cd (as in the examples above and the patch) or should I define a new set of functions, for example ts_rank_lcd?
- should I define these new SQL-level functions in core, or should I keep only this 2-line change in calc_rank_cd() and define the new functions as an extension? If we prefer the latter, could I overload core functions with functions defined in extensions?
- and finally there is always the possibility to duplicate the code and make an independent extension.
2. If I run the patched version on a cluster that was initialized with an unpatched server, is there a way to register the new functions in the system catalog without reinitializing the cluster?
Best regards
luben
--
Luben Karavelov