Re: hunspell and tsearch2 ?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dirk Lutzebäck <dirk(dot)lutzebaeck(at)thinkproject(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: hunspell and tsearch2 ?
Date: 2012-08-30 15:39:03
Message-ID: CA+Tgmob3Mr3PznHK0E15yYKX5PB2xmqJcCHN=ffV62akME_qnQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 27, 2012 at 8:31 AM, Dirk Lutzebäck
<dirk(dot)lutzebaeck(at)thinkproject(dot)com> wrote:
> We have issues with compound words in tsearch2 when using the German (ispell)
> dictionary. This has been discussed before, but there is no real solution
> using the recommended German dictionary at
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2 (which converts an old
> OpenOffice dict file to an ispell file suitable for tsearch):
>
> # select ts_lexize('german_ispell', 'vollklimatisiert');
> ts_lexize
> --------------------
> {vollklimatisiert}
> (1 row)
>
> This should return at least
>
> {vollklimatisiert, voll, klimatisiert}
>
>
> The issue with compound words in ispell has been addressed in hunspell. But
> hunspell has not been fully integrated into tsearch2 (according to the
> documentation).

Just out of curiosity, which part of the documentation are you looking
at? The only mention of hunspell I see in the documentation is a
mention that we apparently support their dictionary-file format.
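
For context, the documented way to use that dictionary-file support is the
ispell template, which can read Hunspell-format files. A minimal sketch,
assuming files named de_de.dict and de_de.affix have been placed in
$SHAREDIR/tsearch_data (the dictionary name german_hunspell and the file
names are hypothetical, not taken from this thread):

```sql
-- Assumption: de_de.dict and de_de.affix exist in $SHAREDIR/tsearch_data.
-- german_hunspell is an illustrative name, not one used in the original report.
CREATE TEXT SEARCH DICTIONARY german_hunspell (
    TEMPLATE  = ispell,
    DictFile  = de_de,
    AffFile   = de_de,
    StopWords = german
);

-- Check how the dictionary handles the compound word from the report.
SELECT ts_lexize('german_hunspell', 'vollklimatisiert');
```

Whether the compound gets split will depend on the compound flags in the
affix file and on how much of the Hunspell format the server version
actually understands.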

> Are there any plans to fully integrate hunspell into tsearch2? What is
> needed to do this? What is the functional delta which is missing? Maybe we
> can help...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
