Re: Fulltext search configuration

From: Mohamed <mohamed5432154321(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Fulltext search configuration
Date: 2009-02-02 11:39:26
Message-ID: 861fed220902020339r6f34b4fchc578966f81ec81f9@mail.gmail.com
Lists: pgsql-general

No, I don't. But ts_lexize doesn't return anything, so I figured there must
be an error somewhere.
I think we are using the same dictionary, except that I am also using the
stopwords file and a different affix file, because using the hunspell
(Ayaspell) .aff gives me this error:

ERROR: wrong affix file format for flag
CONTEXT: line 42 of configuration file "C:/Program Files/PostgreSQL/8.3/share/tsearch_data/hunarabic.affix": "PFX Aa Y 40
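
For reference, a minimal sketch of how the three possible results of
ts_lexize can be told apart, since in psql a NULL result also shows up as
"nothing". It uses the built-in english_stem dictionary for the first two
cases and the hunspell_dic name from this thread for the third; the sample
words are only placeholders.

```sql
-- Known word: the dictionary returns an array of lexemes.
SELECT ts_lexize('english_stem', 'stars');   -- {star}

-- Stop word: the dictionary knows the word and returns an empty array.
SELECT ts_lexize('english_stem', 'a');       -- {}

-- Unknown word: an ispell-type dictionary returns NULL, which psql
-- prints as an empty field. IS NULL makes that case explicit.
SELECT ts_lexize('hunspell_dic', 'ARABIC WORD') IS NULL AS unrecognized;
```

If the last query returns true, the dictionary loaded but did not recognize
the word, which is different from an error while loading the files.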

/ Moe

On Mon, Feb 2, 2009 at 12:13 PM, Daniel Chiaramello <
daniel(dot)chiaramello(at)golog(dot)net> wrote:

> Hi Mohamed.
>
> I don't know where you got your dictionary. I tried the OpenOffice one
> (the Ayaspell one) myself, without success, and I had no Arabic stopwords
> file.
>
> Renaming the file is supposed to be enough (I did it successfully for a
> Thai dictionary): the ".aff" file becomes the ".affix" one.
> When I tried to create the dictionary:
>
> CREATE TEXT SEARCH DICTIONARY ar_ispell (
> TEMPLATE = ispell,
> DictFile = ar_utf8,
> AffFile = ar_utf8,
> StopWords = english
> );
>
> I had an error:
>
> ERREUR: mauvais format de fichier affixe pour le drapeau
> CONTEXTE : ligne 42 du fichier de configuration «
> /usr/share/pgsql/tsearch_data/ar_utf8.affix » : « PFX Aa Y 40
>
> (which means "wrong affix file format for flag", at line 42 of the
> configuration file)
>
> Do you have an error when creating your dictionary?
>
> Daniel
>
> Mohamed wrote:
>
> I have run into some problems here.
> I am trying to implement Arabic full-text search on three columns.
>
> To create a dictionary I have a hunspell dictionary and an Arabic stopword
> file.
>
> CREATE TEXT SEARCH DICTIONARY hunspell_dic (
> TEMPLATE = ispell,
> DictFile = hunarabic,
> AffFile = hunarabic,
> StopWords = arabic
> );
>
>
> 1) The problem is that the hunspell dictionary comes with a .dic and a
> .aff file, but the configuration requires a .dict and a .affix file. I have
> tried changing the extensions, but with no success.
>
> 2) ts_lexize('hunspell_dic', 'ARABIC WORD') returns nothing
>
> 3) How can I convert my .dic and .aff files to valid .dict and .affix files?
>
> 4) I have read that when using dictionaries, if a word is not recognized by
> any dictionary it will not be indexed. I find that troublesome. I would like
> everything but the stop words to be indexed. I guess this might be a step
> that I am not ready for yet, but just wanted to put it out there.
>
>
>
> Also, I would like to know what the process of implementing full-text
> search looks like, from config to search.
>
> Create a dictionary, then a text search configuration, add the dictionary
> to the configuration, index the columns with GIN or GiST ...
>
> What does a search look like? Does it match against the GIN/GiST index?
> Has that index been built using the dictionary/configuration, or is the
> dictionary only used on search phrases?
>
> / Moe
>
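
As a rough sketch of the steps asked about in the quoted message
(dictionary, configuration, index, search), assuming hunspell_dic has been
created without errors and an arabic.stop file is present in tsearch_data.
The names arabic_simple, arabic_cfg, docs, docs_fts_idx and the columns
title, body and summary are made up for illustration. Listing a
simple-template dictionary after hunspell_dic is one way to keep words that
the ispell dictionary does not recognize in the index while still dropping
stop words.

```sql
-- Hypothetical table with the three text columns to search.
CREATE TABLE docs (
    id      serial PRIMARY KEY,
    title   text,
    body    text,
    summary text
);

-- Fallback dictionary (hypothetical name): accepts every word,
-- lowercases it, and discards the ones listed in arabic.stop.
CREATE TEXT SEARCH DICTIONARY arabic_simple (
    TEMPLATE  = simple,
    StopWords = arabic
);

-- New configuration, copied from the built-in "simple" one
-- so that it uses the default parser.
CREATE TEXT SEARCH CONFIGURATION arabic_cfg (COPY = simple);

-- Dictionaries are consulted left to right: hunspell_dic first,
-- then arabic_simple for anything hunspell_dic returned NULL for.
ALTER TEXT SEARCH CONFIGURATION arabic_cfg
    ALTER MAPPING FOR word, hword, hword_part
    WITH hunspell_dic, arabic_simple;

-- The index stores the output of to_tsvector, so the configuration
-- (and its dictionaries) is applied when the index is built.
CREATE INDEX docs_fts_idx ON docs USING gin (
    to_tsvector('arabic_cfg',
                coalesce(title, '') || ' ' ||
                coalesce(body, '') || ' ' ||
                coalesce(summary, ''))
);

-- The same configuration normalizes the search phrase. The WHERE
-- expression must match the indexed expression for the index to be used.
SELECT id, title
FROM docs
WHERE to_tsvector('arabic_cfg',
                  coalesce(title, '') || ' ' ||
                  coalesce(body, '') || ' ' ||
                  coalesce(summary, ''))
      @@ plainto_tsquery('arabic_cfg', 'ARABIC WORD');
```

So the dictionaries are used on both sides: once when the documents are
turned into tsvector values for the index, and again when the search phrase
is turned into a tsquery.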
