From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | obartunov(at)gmail(dot)com, Jean-Pierre Pelletier <jppelletier(at)e-djuster(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ? |
Date: | 2016-06-17 14:06:52 |
Message-ID: | 576403FC.8000702@sigaev.ru |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>>> So I think there's a reasonable case for decreeing that <N> should only
>>> match lexemes *exactly* N apart. If we did that, we would no longer have
>>> the misbehavior that Jean-Pierre is complaining about, and we'd not need
>>> to argue about whether <0> needs to be treated specially.
>
>> Agree, seems that's easy to change.
>> ...
>> Patch is attached
>
> Hmm, couldn't the loop logic be simplified a great deal if this is the
> definition? Or are you leaving it like that with the idea that we might
> later introduce another operator with the less-than-or-equal behavior?
Are you suggesting something like a merge join of two sorted lists? I.e.:
while (Rpos < Rdata.pos + Rdata.npos && Lpos < Ldata.pos + Ldata.npos)
{
    if (*Lpos > *Rpos)
        Rpos++;
    else if (*Lpos < *Rpos)
    {
        if (*Rpos - *Lpos == distance)
            match!
        Lpos++;
    }
    else
    {
        if (distance == 0)
            match!
        Lpos++; Rpos++;
    }
}
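Such an algorithm only ever compares the closest pair of (Lpos, Rpos), but the
pair that satisfies the distance might not be the closest one. Example:
to_tsvector('simple', '1 2 1 2') @@ '1 <3> 2';
Here '1' is at positions 1 and 3 and '2' is at positions 2 and 4, so the only
pair exactly 3 apart is (1, 4), and the loop above never compares those two.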
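
To make the counterexample concrete, here is a rough sketch in Python (not the attached patch, and not PostgreSQL code). It assumes lexeme positions are plain sorted lists of integers. merge_join_match is a literal transcription of the loop above; exact_distance_match is one hypothetical way a single pass could still work, by comparing each right position against left position + distance instead of against the left position itself. Both function names are invented for illustration.

```python
def merge_join_match(lpos, rpos, distance):
    """Literal transcription of the merge-join loop sketched above."""
    i = j = 0
    while i < len(lpos) and j < len(rpos):
        if lpos[i] > rpos[j]:
            j += 1
        elif lpos[i] < rpos[j]:
            if rpos[j] - lpos[i] == distance:
                return True
            i += 1
        else:
            if distance == 0:
                return True
            i += 1
            j += 1
    return False


def exact_distance_match(lpos, rpos, distance):
    """Hypothetical alternative: look for rpos[j] - lpos[i] == distance.

    Both lists are assumed sorted ascending, so the target position
    lpos[i] + distance only grows as i advances.
    """
    i = j = 0
    while i < len(lpos) and j < len(rpos):
        target = lpos[i] + distance
        if rpos[j] < target:
            j += 1          # this right position is too early for any later left
        elif rpos[j] > target:
            i += 1          # no remaining right position can pair with this left
        else:
            return True
    return False


# Positions for to_tsvector('simple', '1 2 1 2'): '1' -> [1, 3], '2' -> [2, 4]
ones, twos = [1, 3], [2, 4]
print(merge_join_match(ones, twos, 3))      # the merge join misses the pair (1, 4)
print(exact_distance_match(ones, twos, 3))  # the shifted comparison finds it
```

With the positions from the example, the first call returns False and the second returns True. Whether the shifted comparison fits the existing loop structure in the patch is a separate question; the sketch only shows that the plain merge join is not sufficient.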
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/