
Re: a tsearch2 (8.2.4) dictionary that only filters out stopwords

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl>, pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: a tsearch2 (8.2.4) dictionary that only filters out stopwords
Date: 2007-11-14 17:29:40
Message-ID: Pine.LNX.4.64.0711142023400.7787@sn.sai.msu.ru
Lists: pgsql-hackers, pgsql-patches
On Wed, 14 Nov 2007, Tom Lane wrote:

> Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
>> Let's consider one example - removing accents.
>> In the past I always recommended that people use regex functions before
>> the to_tsvector conversion to remove accents, but recently I noticed that
>> this trick doesn't work with headline(). So the only way is to put a
>> special dictionary, dict_remove_accent, first, which works as a filter.
>
>> I don't remember why we left this for future releases, though.
>
> That would require a system-to-dictionary API change (to be able to
> modify the token under inspection), no?  So it's certainly something

It requires one reserved option for dictionaries and the ability to get a
dictionary's option. Unless somebody has a dictionary with the same option,
this change looks harmless.
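
To illustrate the idea, here is a minimal sketch in Python rather than in the tsearch2 C API. It assumes a hypothetical per-dictionary flag (is_filter below) playing the role of the reserved option: a dictionary marked as a filter rewrites the token and hands the result to the next dictionary instead of ending the lookup. The names remove_accents, stopword_dict, lexize and CHAIN are made up for this example, and the stopword list is arbitrary.

```python
import unicodedata

# Arbitrary stopword list, for illustration only.
STOPWORDS = frozenset({"the", "a", "of"})


def remove_accents(token):
    """Hypothetical filter dictionary: strip combining marks from the token."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def stopword_dict(token):
    """Return [] for a stopword, otherwise accept the token as its own lexeme."""
    return [] if token in STOPWORDS else [token]


def lexize(token, chain):
    """Run a token through a chain of (dictionary, is_filter) pairs.

    A filter replaces the token under inspection and the chain continues.
    A normal dictionary ends the lookup as soon as it returns a non-None
    result (an empty list means the token is dropped as a stopword).
    """
    for dictionary, is_filter in chain:
        result = dictionary(token)
        if is_filter:
            token = result  # rewritten token goes to the next dictionary
            continue
        if result is not None:
            return result
    return None  # no dictionary recognized the token


# The accent remover is placed first and flagged as a filter (assumed option).
CHAIN = [(remove_accents, True), (stopword_dict, False)]
```

With this chain, lexize("hôtel", CHAIN) reaches the stopword dictionary as "hotel", while a stopword such as "the" is still dropped. The sketch says nothing about how a thesaurus dictionary, which works on phrases, would interact with a filter.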

> I'd say is too late for 8.3.

Yes, probably we will get a better idea.

>
> One thought that came to mind is that the option name should be just
> "Accept" not "AcceptAll".  To me "All" implies that it would accept
> *everything* ... including stopwords.

Wait, I remember the problem with filters: how will it work with the thesaurus?

 	Regards,
 		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
