Re: Tsearch2 and Snowball

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Tsearch2 and Snowball
Date: 2006-10-04 07:49:23
Message-ID: Pine.GSO.4.63.0610041147030.18168@ra.sai.msu.su
Lists: pgsql-hackers

Simon,

We already have almost everything you listed in our TODO:
http://www.sai.msu.su/~megera/wiki/todo

Btw, there is a gendict subdirectory, which helps people generate
dictionaries (including Snowball stemmers) for tsearch2.

Oleg

On Tue, 3 Oct 2006, Simon Riggs wrote:

>
> I'm looking at some of the code in contrib/tsearch2/snowball and see
> that the code there is *generated* code. The Snowball stemmer produces
> this C code in much the same way bison reads gram.y
>
> My understanding is that the Snowball code moves forwards regularly and
> there are many other stemmers we could be including with the
> distribution.
>
> Snowball has a BSD licence: http://snowball.tartarus.org/license.php
> Would it be possible to include the Snowball source directly and allow
> its execution to be part of the make process for tsearch2? Or have
> configure check for Snowball at make time? At the very least it would be
> good to have a Readme file explaining how to modify the Snowball stemmer
> and regenerate for tsearch2.
>
> That would then encourage people to improve the stemmers, as well as
> allow us to include French and Spanish versions, etc.
>
> Perhaps we should ask translators to provide stop word lists for their
> languages. It seems a shame to have docs in so many languages, but no
> language capability for Tsearch2.
>
> Also, why do we have another crc32 implementation in there?
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
