From: | "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> |
Cc: | "Teodor Sigaev" <teodor(at)sigaev(dot)ru>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: integrated tsearch has different results than tsearch2 |
Date: | 2007-09-04 11:52:23 |
Message-ID: | 162867790709040452o4f0f2558m37adb4219b3e7ed6@mail.gmail.com |
Lists: | pgsql-hackers |
I used the dictionaries from the Fedora Core package
hunspell-cs-20060303-5.fc7.i386.rpm
and then converted them to UTF-8 with iconv.
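A minimal sketch of such a conversion follows. The source encoding (ISO-8859-2) and the file names `cs_CZ.aff`, `cs_CZ.dic`, `czech.affix` and `czech.dict` are assumptions for illustration, not taken from the package; check what the rpm actually ships before relying on them.

```shell
#!/bin/sh
# Convert one dictionary file to UTF-8 with iconv.
#   $1 = source encoding (assumed here to be ISO-8859-2)
#   $2 = input file
#   $3 = output file
convert_to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}

# Hypothetical usage (file names and encoding are assumptions):
#   convert_to_utf8 ISO-8859-2 cs_CZ.aff czech.affix
#   convert_to_utf8 ISO-8859-2 cs_CZ.dic czech.dict
```

The converted files would then go where the server looks for dictionary files, with the `.affix` and `.dict` extensions it expects.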
Pavel
2007/9/4, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>:
> Pavel Stehule wrote:
> > 2007/9/3, Teodor Sigaev <teodor(at)sigaev(dot)ru>:
> >>> 1. I am not able to use full text search with latin2 encoding :( (I am
> >>> missing a note in the docs that dictionaries must be utf8.)
> >> You can use any server encoding, but the dictionary files should be in utf8;
> >> the dictionary will convert the utf8 files into the server encoding.
> >>
> >>>
> >>> 2. With hspell dictionaries (a fresh copy from OpenOffice) I got
> >>> different and wrong results.
> >>> postgres=# select to_tsvector('cs','Příliš žlutý kůň se napil žluté vody') @@ to_tsquery('cs','napít');
> >>> ?column?
> >>> ----------
> >>> f
> >>> (1 row)
> >> Please post the output of:
> >> select ts_lexize('cspell','napil');
> >> select to_tsvector('cs','Příliš žlutý kůň se napil žluté vody');
> >>
> >>
> > postgres=# select ts_lexize('cspell','napil');
> > ts_lexize
> > -----------
> >
> > (1 row)
> > postgres=# select to_tsvector('cs','Příliš žlutý kůň se napil žluté vody');
> > to_tsvector
> > -----------------------------------------------------------
> > 'vody':7 'kůň':3 'napil':5 'žluté':6 'žlutý':2 'příliš':1
> > (1 row)
> >
> > There is a difference between the versions:
> > 8.2.x
> > postgres=# select lexize('cz_ispell','jablka');
> > lexize
> > ----------
> > {jablko}
> > (1 row)
> > 8.3
> > postgres=# select ts_lexize('cspell','jablka');
> > ts_lexize
> > -----------
> >
> > (1 row)
> > postgres=# select ts_lexize('cspell','jablko');
> > ts_lexize
> > -----------
> > {jablko}
> > (1 row)
>
> Can you post a link to the ispell dictionary file you're using so I and
> others can reproduce that?
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>