Re: Fulltext search configuration

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Mohamed <mohamed5432154321(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Fulltext search configuration
Date: 2009-02-02 13:41:53
Message-ID: Pine.LNX.4.64.0902021641180.4158@sn.sai.msu.ru
Lists: pgsql-general

Mohamed,

We are looking into the problem.

Oleg
On Mon, 2 Feb 2009, Mohamed wrote:

> No, I don't. But ts_lexize doesn't return anything, so I figured there must
> be an error somewhere.
> I think we are using the same dictionary, except that I am using the
> stopwords file and a different affix file, because using the hunspell
> (ayaspell) .aff gives me this error:
>
> ERROR: wrong affix file format for flag
> CONTEXT: line 42 of configuration file "C:/Program
> Files/PostgreSQL/8.3/share/tsearch_data/hunarabic.affix": "PFX Aa Y 40
>
> / Moe
>
>
>
>
> On Mon, Feb 2, 2009 at 12:13 PM, Daniel Chiaramello <
> daniel(dot)chiaramello(at)golog(dot)net> wrote:
>
>> Hi Mohamed.
>>
>> I don't know where you got the dictionary - I tried the OpenOffice one
>> myself (the Ayaspell one) without success, and I had no Arabic stopwords
>> file.
>>
>> Renaming the file is supposed to be enough (I did it successfully for the
>> Thai dictionary) - the ".aff" file becoming the ".affix" one.
>> When I tried to create the dictionary:
>>
>> CREATE TEXT SEARCH DICTIONARY ar_ispell (
>> TEMPLATE = ispell,
>> DictFile = ar_utf8,
>> AffFile = ar_utf8,
>> StopWords = english
>> );
>>
>> I had an error:
>>
>> ERREUR: mauvais format de fichier affixe pour le drapeau
>> CONTEXTE : ligne 42 du fichier de configuration «
>> /usr/share/pgsql/tsearch_data/ar_utf8.affix » : « PFX Aa Y 40
>>
>> (which means "wrong affix file format for flag", at line 42 of the
>> configuration file)
>>
>> Do you have an error when creating your dictionary?
>>
>> Daniel
>>
>> Mohamed wrote:
>>
>> I have run into some problems here.
>> I am trying to implement Arabic fulltext search on three columns.
>>
>> To create a dictionary I have a hunspell dictionary and an Arabic stop-word
>> file.
>>
>> CREATE TEXT SEARCH DICTIONARY hunspell_dic (
>> TEMPLATE = ispell,
>> DictFile = hunarabic,
>> AffFile = hunarabic,
>> StopWords = arabic
>> );
>>
>>
>> 1) The problem is that hunspell provides a .dic and a .aff file, but
>> the configuration requires a .dict and a .affix file. I have tried changing
>> the extensions, but with no success.
>>
>> 2) ts_lexize('hunspell_dic', 'ARABIC WORD') returns nothing
>>
>> 3) How can I convert my .dic and .aff to valid .dict and .affix files?
>>
>> 4) I have read that when using dictionaries, if a word is not recognized by
>> any dictionary it will not be indexed. I find that troublesome. I would like
>> everything but the stop words to be indexed. I guess this might be a step
>> that I am not ready for yet, but just wanted to put it out there.
>>
>>
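
On points 2 and 4 above, a minimal sketch of one way to check what a dictionary does with a word and to keep unrecognized words indexed. It assumes the hunspell_dic dictionary was created successfully (which is not the case yet in this thread) and that a text search configuration already exists; the name arabic_cfg and the sample strings are hypothetical.

```sql
-- ts_lexize returns NULL when the dictionary does not recognize the word,
-- an empty array ({}) when the word is a stop word, and an array of
-- lexemes when the word is recognized.
SELECT ts_lexize('hunspell_dic', 'some_word');

-- Dictionaries mapped to a token type are consulted left to right until one
-- recognizes the token. Listing the built-in "simple" dictionary last means
-- that words unknown to hunspell_dic are still indexed, lowercased as-is.
-- arabic_cfg is an assumed, already existing configuration.
ALTER TEXT SEARCH CONFIGURATION arabic_cfg
    ALTER MAPPING FOR word, hword, hword_part
    WITH hunspell_dic, simple;

-- ts_debug shows, token by token, which dictionary handled each word.
SELECT * FROM ts_debug('arabic_cfg', 'some text to inspect');
```

Stop words listed in the dictionary's StopWords file are still dropped, because a dictionary that returns an empty array ends the chain for that token.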
>>
>> Also I would like to know what the process of implementing fulltext
>> search looks like, from config to search.
>>
>> Create a dictionary, then a text search configuration, add the dictionary
>> to the configuration, index the columns with gin or gist ...
>>
>> What does a search look like? Does it match against the gin/gist index?
>> Has that index been built using the dictionary/configuration, or is the
>> dictionary only used on search phrases?
>>
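
The steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the table ads, its three columns, the index name and the configuration name arabic_cfg are all hypothetical, and the configuration is copied from the built-in simple one so that the example does not depend on the Arabic dictionary loading.

```sql
-- Hypothetical table with the three columns to be searched.
CREATE TABLE ads (
    id          serial PRIMARY KEY,
    title       text,
    description text,
    category    text
);

-- Start from the built-in "simple" configuration. Once a dictionary loads,
-- it can be attached to this configuration with
-- ALTER TEXT SEARCH CONFIGURATION ... ALTER MAPPING.
CREATE TEXT SEARCH CONFIGURATION arabic_cfg (COPY = pg_catalog.simple);

-- Expression index: to_tsvector runs every row through the configuration
-- (parser plus dictionaries), so the index stores normalized lexemes.
CREATE INDEX ads_fts_idx ON ads USING gin (
    to_tsvector('arabic_cfg',
        coalesce(title, '') || ' ' ||
        coalesce(description, '') || ' ' ||
        coalesce(category, ''))
);

-- The query text goes through the same configuration, and the @@ operator
-- can use the index because the expression matches the indexed one.
SELECT id, title
FROM ads
WHERE to_tsvector('arabic_cfg',
        coalesce(title, '') || ' ' ||
        coalesce(description, '') || ' ' ||
        coalesce(category, ''))
      @@ plainto_tsquery('arabic_cfg', 'search words');
```

In other words, the configuration is applied twice: once when the index is built or a row changes, and once to the search phrase, which is why the same configuration name should be passed to both to_tsvector and plainto_tsquery.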
>> / Moe
>>
>>
>>
>>
>>
>>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
