From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Cc: | Sushant Sinha <sushant354(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: phrase search |
Date: | 2008-07-22 18:42:03 |
Message-ID: | 488629FB.2030501@sigaev.ru |
Lists: | pgsql-hackers |
>> 1. What is the meaning of such a query operator?
>>
>> foo #5 bar -> true if the document has word "foo" followed by "bar" at
>> 5th position.
>>
>> foo #<5 bar -> true if document has word "foo" followed by "bar" within
>> 5 positions
>>
>> foo #>5 bar -> true if document has word "foo" followed by "bar" after 5
>> positions
Sounds good, but maybe it's overkill.
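
To make the three proposed operators concrete, here is a minimal sketch in Python rather than in the backend's C. It assumes a document is a toy mapping from lexeme to sorted word positions (loosely what a tsvector stores), and it reads "#<5" as a distance strictly less than 5 and "#>5" as strictly greater than 5. The `follows` helper and both of those readings are assumptions made for illustration, not something fixed in this thread.

```python
from typing import Callable, Dict, List

# Hypothetical toy document: lexeme -> sorted word positions.
Doc = Dict[str, List[int]]


def follows(doc: Doc, left: str, right: str,
            pred: Callable[[int], bool]) -> bool:
    """True if some `right` occurs after some `left` at a distance
    (right position minus left position) accepted by `pred`."""
    return any(
        pred(r - l)
        for l in doc.get(left, [])
        for r in doc.get(right, [])
        if r > l
    )


doc = {"foo": [1, 12], "bar": [6, 30]}

exactly_5 = follows(doc, "foo", "bar", lambda d: d == 5)  # foo #5 bar
within_5 = follows(doc, "foo", "bar", lambda d: d < 5)    # foo #<5 bar
beyond_5 = follows(doc, "foo", "bar", lambda d: d > 5)    # foo #>5 bar
```

All three operators reduce to one positional scan with a different predicate on the distance, which is part of why the extra variants may not be worth separate operator syntax.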
>> etc .....
>>
>> 2. How to implement such query operators?
>>
>> Should we modify QueryItem to include additional distance information or
>> is there any other way to accomplish it?
>>
>> Is the following list sufficient to accomplish this?
>> a. Modify to_tsquery
>> b. Modify TS_execute in tsvector_op.c to check new operator
Exactly
>>
>> Is there anything needed in rewrite subsystem?
Yes, of course: the rewrite system should support that operation.
>>
>> 3. Are these valid uses of the operators and if yes what would they
>> mean?
>>
>> foo #5 (bar & cup)
It must be supported, because lexize might return a sub-tsquery. For example,
Russian ispell can return several lexemes: "adfg" can become 'adf | adfs | ad'.
Norwegian and German are more complicated: "abc" -> "(ab & c) | (a & bc) | abc".
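
One possible way to handle a sub-tsquery operand is to push the distance operator down to the individual lexemes. The Python sketch below shows that idea only. The tuple-based query tree, the `distribute` and `evaluate` helpers, the choice to distribute over both "&" and "|", and the assumption that the parts of a split compound share one word position are all hypothetical. None of them is a decision made in this thread.

```python
from typing import Any, Dict, List

# Hypothetical toy document: lexeme -> sorted word positions.
Doc = Dict[str, List[int]]


def distribute(left: str, dist: int, right: Any) -> Any:
    """Rewrite `left #dist right` so that the distance operator
    only ever sees plain lexemes on its right-hand side."""
    if isinstance(right, str):
        return ("#", left, dist, right)
    op, a, b = right  # op is "&" or "|"
    return (op, distribute(left, dist, a), distribute(left, dist, b))


def evaluate(doc: Doc, query: Any) -> bool:
    """Evaluate a tree produced by `distribute` against one document."""
    if query[0] == "#":
        _, left, dist, right = query
        return any(
            r - l == dist
            for l in doc.get(left, [])
            for r in doc.get(right, [])
        )
    op, a, b = query
    if op == "&":
        return evaluate(doc, a) and evaluate(doc, b)
    return evaluate(doc, a) or evaluate(doc, b)


# "abc" -> (ab & c) | (a & bc) | abc, as in the example above.
abc = ("|", ("&", "ab", "c"), ("|", ("&", "a", "bc"), "abc"))
query = distribute("foo", 1, abc)

# Assumed: both parts of the split compound sit at the same position.
doc = {"foo": [1], "a": [2], "bc": [2]}
matched = evaluate(doc, query)
```

Here `foo #1 abc` matches through the `(a & bc)` branch, because both parts of the compound are found one position after "foo".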
>> 4. If the operator only applies to two query items can we create an
>> index such that (foo, bar)-> documents[min distance, max distance]
>> How difficult it is to implement an index like this?
No, the index should execute the query 'foo & bar' and set the recheck flag to
true, so that 'foo #<5 bar' is executed on the original tsvector from the table.
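
The two-phase idea can be sketched as follows, in Python for brevity. The index here is a plain inverted index that knows nothing about positions, so it can only answer 'foo & bar'. Every candidate is then rechecked against the stored positions. All names (`build_index`, `index_scan`, `recheck`, `search`) are made up for this example, and "#<5" is assumed to mean a distance strictly less than 5.

```python
from typing import Dict, List, Set

# Hypothetical toy document: lexeme -> sorted word positions.
Doc = Dict[str, List[int]]


def build_index(docs: Dict[int, Doc]) -> Dict[str, Set[int]]:
    """Inverted index: lexeme -> ids of documents containing it."""
    index: Dict[str, Set[int]] = {}
    for doc_id, doc in docs.items():
        for lexeme in doc:
            index.setdefault(lexeme, set()).add(doc_id)
    return index


def index_scan(index: Dict[str, Set[int]], left: str, right: str) -> Set[int]:
    """Phase 1: the index evaluates only 'left & right' (lossy)."""
    return index.get(left, set()) & index.get(right, set())


def recheck(doc: Doc, left: str, right: str, limit: int) -> bool:
    """Phase 2: evaluate 'left #<limit right' on the stored positions."""
    return any(
        0 < r - l < limit
        for l in doc.get(left, [])
        for r in doc.get(right, [])
    )


def search(docs: Dict[int, Doc], index: Dict[str, Set[int]],
           left: str, right: str, limit: int) -> List[int]:
    """Index scan followed by a recheck of every candidate."""
    return sorted(
        doc_id
        for doc_id in index_scan(index, left, right)
        if recheck(docs[doc_id], left, right, limit)
    )


docs = {
    1: {"foo": [1], "bar": [3]},   # distance 2: real match
    2: {"foo": [1], "bar": [20]},  # distance 19: candidate, rejected on recheck
    3: {"foo": [4]},               # no "bar": never a candidate
}
index = build_index(docs)
candidates = index_scan(index, "foo", "bar")
matches = search(docs, index, "foo", "bar", 5)
```

Document 2 passes the index scan but fails the recheck, which is exactly the false positive the recheck flag exists to filter out.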
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/