From: | Nathan Wagner <nw+pg(at)hydaspes(dot)if(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: bugs and bug tracking |
Date: | 2015-10-08 17:11:20 |
Message-ID: | 20151008171120.GA5136@granicus.if.org |
Lists: | pgsql-hackers |
On Wed, Oct 07, 2015 at 03:06:50PM -0400, Stephen Frost wrote:
> * Nathan Wagner (nw+pg(at)hydaspes(dot)if(dot)org) wrote:
> > I have added full text searching to my tracker. I only index the first
> > 50 KB of each message. There's apparently a one MB limit on that
> > anyway, which a few messages exceed. I figure anything important is
> > probably in the first 50KB. I could be wrong. I could re-index fairly
> > easily. It seems to work pretty well.
>
> Note that we have FTS for the -bugs, and all the other, mailing lists..
True, but that finds emails. The search I have finds bugs (well, bug reports
anyway). Specifically, I have the following function:
create or replace function bugvector(bugid bigint)
returns tsvector language 'sql' as $$
    select tsvagg(
        setweight(to_tsvector(substr(body(msg), 1, 50*1024)), 'D')
        ||
        setweight(to_tsvector(header_value(msg, 'Subject')), 'C')
    )
    from emails
    where bug = $1
$$ strict;
which, as you can see, collects into one tsvector all the emails associated
with that particular bug. So a search hit is for the whole bug. There are
probably some search artifacts here. I suspect a bug with a long email thread
will be ranked higher than one with a short thread. Perhaps that's ok
though.
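
For anyone following along: tsvagg is not a built-in PostgreSQL aggregate, and
its definition is not shown in this message. The following is only a sketch of
how such an aggregate could be defined and how a whole-bug search might use
bugvector(). The table bugs(id bigint) and the search terms are assumptions
made for illustration. Passing normalization flag 1 to ts_rank is one possible
way to damp the long-thread effect mentioned above, not something the tracker
is known to do.

```sql
-- Hypothetical definition: the message does not show how tsvagg is defined.
-- tsvector_concat is the built-in function behind the tsvector || operator.
-- It is strict, so rows whose vector is NULL are skipped by the aggregate.
create aggregate tsvagg(tsvector) (
    sfunc = tsvector_concat,
    stype = tsvector,
    initcond = ''
);

-- Hypothetical search over an assumed table bugs(id bigint).
-- Normalization flag 1 divides the rank by 1 + log(document length),
-- which reduces the advantage of bugs with long email threads.
select b.id,
       ts_rank(bugvector(b.id), q, 1) as rank
from bugs b,
     plainto_tsquery('index corruption') as q
where bugvector(b.id) @@ q
order by rank desc
limit 20;
```

This recomputes bugvector() for every bug on every search, so in practice the
vector would more likely be stored in a column and indexed with GIN.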
--
nw