Re: Tsearch2 crashes my backend, ouch !

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Listmail <lists(at)peufeu(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Tsearch2 crashes my backend, ouch !
Date: 2007-04-01 18:26:35
Message-ID: Pine.LNX.4.64.0704012223380.12152@sn.sai.msu.ru
Lists: pgsql-general

On Fri, 30 Mar 2007, Listmail wrote:

>
> OK, I've solved my problem... thanks for the hint !
>
> Anyway, just to signal that tsearch2 crashes if SELECT is not granted
> on pg_ts_dict (other tables give a proper error message when not GRANTed). On

I don't understand this. Are you sure about this?
From the prompt in your select examples I see you have superuser rights,
and you have successfully selected from the pg_ts_dict table.
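
One way to check this from a shell, as a rough sketch: has_table_privilege()
reports whether a given role may SELECT from the table. The role name
"webuser" is only a placeholder (no such role is named in this thread); the
database name "test" is the one from the examples in the quoted message.

```shell
# Sketch only: "webuser" is a placeholder role name, not one from this thread.
# Prints "t" if the role may SELECT from pg_ts_dict, "f" otherwise.
can_select_ts_dict() {
    role="$1"
    psql -U postgres -At \
        -c "SELECT has_table_privilege('${role}', 'pg_ts_dict', 'SELECT');" \
        test
}

# Example call:
#   can_select_ts_dict webuser
# If it prints "f", the privilege could be granted with:
#   psql -U postgres -c "GRANT SELECT ON pg_ts_dict TO webuser;" test
```

Running the failing lexize() call as that same non-superuser role would then
show whether the crash really depends on the missing privilege.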

Oleg

> Fri, 30 Mar 2007 13:20:30 +0200, Listmail <lists(at)peufeu(dot)com> wrote:
>
>>
>> Hello,
>>
>> I have just ditched Gentoo and installed a brand new kubuntu system
>> (was tired of the endless compiles).
>> I have a problem with crashing tsearch2. This appeared both on Gentoo
>> and the brand new kubuntu.
>>
>> I will describe all my install procedure, maybe I'm doing something
>> wrong.
>>
>> Cluster is newly created and empty.
>>
>> initdb was done with UNICODE encoding & locales.
>>
>> # from postgresql.conf
>>
>> # These settings are initialized by initdb -- they might be changed
>> lc_messages = 'fr_FR.UTF-8'   # locale for system error message strings
>> lc_monetary = 'fr_FR.UTF-8'   # locale for monetary formatting
>> lc_numeric = 'fr_FR.UTF-8'    # locale for number formatting
>> lc_time = 'fr_FR.UTF-8'       # locale for time formatting
>>
>> peufeu(at)apollo13:~$ locale
>> LANG=fr_FR.UTF-8
>> LC_CTYPE="fr_FR.UTF-8"
>> LC_NUMERIC="fr_FR.UTF-8"
>> etc...
>>
>> First import needed .sql files from contrib and check that the
>> default tsearch2 config works for English
>>
>> $ createdb -U postgres test
>> $ psql -U postgres test <tsearch2.sql and other contribs I use
>> $ psql -U postgres test
>>
>> test=# select lexize( 'en_stem', 'flying' );
>> lexize
>> --------
>> {fli}
>>
>> test=# select to_tsvector('default', 'flying ducks');
>> to_tsvector
>> ------------------
>> 'fli':1 'duck':2
>>
>> OK, seems to work very nicely, now install French.
>> Since this is Kubuntu there is no source, so download source, then :
>>
>> - apply patch_tsearch_snowball_82 from tsearch2 website
>>
>> ./configure --prefix=/usr/lib/postgresql/8.2/ \
>>     --datadir=/usr/share/postgresql/8.2 --enable-nls=fr --with-python
>> cd contrib/tsearch2
>> make
>> cd gendict
>> (copy french stem.c and stem.h from the snowball website)
>> ./config.sh -n fr -s -p french_UTF_8 -i -v -c stem.c -h stem.h \
>>     -C'Snowball stemmer for French'
>> cd ../../dict_fr
>> make clean && make
>> sudo make install
>>
>> Now we have :
>>
>> /bin/sh ../../config/install-sh -c -m 644 dict_fr.sql '/usr/share/postgresql/8.2/contrib'
>> /bin/sh ../../config/install-sh -c -m 755 libdict_fr.so.0.0 '/usr/lib/postgresql/8.2/lib/dict_fr.so'
>>
>> Okay...
>>
>> - download and install UTF8 french dictionaries from
>> http://www.davidgis.fr/download/tsearch2_french_files.zip and put them in
>> contrib directory
>> (the files delivered by debian package ifrench are ISO8859, bleh)
>>
>> - import french shared libs
>> psql -U postgres test < /usr/share/postgresql/8.2/contrib/dict_fr.sql
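
Since the encoding of the dictionary files matters here (UTF-8 versus
ISO8859), a quick sanity check could look like the sketch below. It assumes
the files are named french.dict, french.aff and french.stop and sit in the
contrib directory, as in the paths shown further down in this message; adjust
as needed. It only tests whether each file decodes as UTF-8 and says nothing
about whether encoding is related to the crash.

```shell
# Sketch only: succeeds if the file is readable and decodes cleanly as UTF-8.
is_utf8() {
    # iconv exits non-zero on a byte sequence that is not valid UTF-8
    # (and also if the file cannot be read).
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Assumed file locations, taken from the paths used later in this message.
for f in /usr/share/postgresql/8.2/contrib/french.dict \
         /usr/share/postgresql/8.2/contrib/french.aff \
         /usr/share/postgresql/8.2/contrib/french.stop; do
    if is_utf8 "$f"; then
        echo "$f: valid UTF-8"
    else
        echo "$f: not valid UTF-8 (or unreadable)"
    fi
done
```

A file reported as not valid could be converted with iconv (for example from
ISO-8859-1 to UTF-8) before pointing tsearch2 at it.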
>>
>> Then :
>>
>> test=# select lexize( 'en_stem', 'flying' );
>> lexize
>> --------
>> {fli}
>>
>> And :
>>
>> test=# select * from pg_ts_dict where dict_name ~ '^(fr|en)';
>>  dict_name |       dict_init       |   dict_initoption    |              dict_lexize              |        dict_comment
>> -----------+-----------------------+----------------------+---------------------------------------+-----------------------------
>>  en_stem   | snb_en_init(internal) | contrib/english.stop | snb_lexize(internal,internal,integer) | English Stemmer. Snowball.
>>  fr        | dinit_fr(internal)    |                      | snb_lexize(internal,internal,integer) | Snowball stemmer for French
>>
>> test=# select lexize( 'fr', 'voyageur' );
>> server closed the connection unexpectedly
>>
>> BLAM ! Try something else :
>>
>> test=# UPDATE pg_ts_dict SET
>> dict_initoption='/usr/share/postgresql/8.2/contrib/french.stop' WHERE
>> dict_name = 'fr';
>> UPDATE 1
>> test=# select lexize( 'fr', 'voyageur' );
>> server closed the connection unexpectedly
>>
>> Try other options :
>>
>> dict_name | fr_ispell
>> dict_init | spell_init(internal)
>> dict_initoption | DictFile="/usr/share/postgresql/8.2/contrib/french.dict",AffFile="/usr/share/postgresql/8.2/contrib/french.aff",StopFile="/usr/share/postgresql/8.2/contrib/french.stop"
>> dict_lexize | spell_lexize(internal,internal,integer)
>> dict_comment |
>>
>> test=# select lexize( 'en_stem', 'traveler' ), lexize( 'fr_ispell',
>> 'voyageur' );
>> -[ RECORD 1 ]-------
>> lexize | {travel}
>> lexize | {voyageuse}
>>
>> Now it works (kinda), but it doesn't stem French words (since
>> snowball is out). It should return 'voyage' (=travel) instead of
>> 'voyageuse' (=female traveler).
>> That's not what I want; I want to use snowball to stem French words.
>>
>> I'm going to make a debug build and try to debug it, but if anyone
>> can help, you're really, really welcome.
>>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
