Re: Stemming not working with tsearch2() function

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: psql psql <psql(at)unrulymedia(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Stemming not working with tsearch2() function
Date: 2007-04-30 15:51:42
Message-ID: Pine.LNX.4.64.0704301948281.12152@sn.sai.msu.ru
Lists: pgsql-general

On Mon, 30 Apr 2007, psql psql wrote:

> On 4/30/07, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
>>
>> On Mon, 30 Apr 2007, psql psql wrote:
>>
>> > Anyone know why to_tsvector('sausages') might return "sausages" while
>> > to_tsvector('default','sausages') correctly returns "sausag"?
>> >
>> > This is causing me a fairly major headache. I am guessing that the
>> > tsearch2() function used in my trigger is not specifying "default" when
>> > creating the tsvector since the words being put into the vector are not
>> > correctly stemmed (if that is the correct term).
>> >
>> > I figure this may be something to do with locale settings, other info:
>>
>> It is. Read http://www.sai.msu.su/~megera/wiki/Tsearch_V2_Notes
>
>
> Thanks for the link.
>
> select * from pg_ts_cfg where oid=show_curcfg();
>  ts_name | prs_name |   locale
> ---------+----------+-------------
>  simple  | default  | en_US.UTF-8
>
>
> That's helped me understand that the configuration used by the tsearch2()
> function is not 'default' but 'simple', but I still don't understand why
> 'simple' is being chosen when both 'default' and 'simple' have the same
> locale set in pg_ts_cfg (en_US.UTF-8). Am I missing something?

At present, having several configurations that match the same locale leads
to unpredictable results. Leave only one of them assigned to that locale.
In 8.3 we have a special flag to mark the FTS configuration that should be
selected as the default:
http://www.sai.msu.su/~megera/postgres/fts/doc/fts-cfg.html
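
A minimal sketch of "leave only one", assuming that 'default' is the
configuration wanted for en_US.UTF-8 and that detaching 'simple' from the
locale (rather than deleting its row) is acceptable. It uses only the
pg_ts_cfg table and the functions already shown in this thread; adjust the
names to your own installation.

```sql
-- List the configurations that currently claim the server locale.
SELECT ts_name, locale FROM pg_ts_cfg WHERE locale = 'en_US.UTF-8';

-- Assumption: keep 'default' for this locale and detach 'simple' from it,
-- so that only one row matches en_US.UTF-8.
UPDATE pg_ts_cfg SET locale = NULL WHERE ts_name = 'simple';

-- Reconnect first (the current configuration may be cached per session),
-- then check which configuration is picked up without naming one.
SELECT ts_name FROM pg_ts_cfg WHERE oid = show_curcfg();

-- This should now agree with to_tsvector('default', 'sausages').
SELECT to_tsvector('sausages');
```

Rows whose tsvector column was already filled by the trigger under 'simple'
keep their unstemmed lexemes until those rows are updated again. To change
the configuration for one session only, set_curcfg('default') can be used
instead of editing pg_ts_cfg.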

>
>>
>> > postgresql version 8.2.4 (upgraded from 8.2.0 by rpm on Fedora Core 6
>> > and prior to that from a 7.x version, although I reinstalled tsearch2)
>> >
>> > SELECT * from pg_ts_cfg;
>> >      ts_name     | prs_name |    locale
>> > -----------------+----------+--------------
>> >  default_russian | default  | ru_RU.KOI8-R
>> >  utf8_russian    | default  | ru_RU.UTF-8
>> >  simple          | default  | en_US.UTF-8
>> >  default         | default  | en_US.UTF-8
>> >
>> >
>> > lc_collate | en_US.UTF-8
>> > lc_ctype | en_US.UTF-8
>> > lc_messages | en_US.UTF-8
>> > lc_monetary | en_US.UTF-8
>> > lc_numeric | en_US.UTF-8
>> > lc_time | en_US.UTF-8
>> >
>>
>> Regards,
>> Oleg
>> ______________________________
>> phone: +007(495)939-16-83, +007(495)939-23-83
>>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
