Re: unaccent

From: nngodinh(at)tiscali(dot)it
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unaccent
Date: 2002-09-18 14:28:04
Message-ID: 3D6DC7A40001DB8A@mail-7.tiscalinet.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The txt2txtidx function works fine with unac. The problem is with the trigger:

create trigger txtidxupdate before update or insert on titles for each row
execute procedure tsearch(titleidx, title);

As you know tsearch(titleidx, unac(title)) doesn't work.

>-- Messaggio Originale --
>Date: Wed, 18 Sep 2002 17:04:56 +0300 (GMT)
>From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
>To: nngodinh(at)tiscali(dot)it
>Cc: pgsql-hackers(at)postgresql(dot)org
>Subject: Re: [HACKERS] unaccent
>
>
>On Wed, 18 Sep 2002 nngodinh(at)tiscali(dot)it wrote:
>
>> The best way to use it is quite simple. If you want to index the table
>"titles"
>> and "title" is the field containing the text to be indexed, you can create
>> another unaccented field, for instance "utitle".
>>
>> UPDATE titles SET utitle = unac(title);
>>
>> Of course you can set it up as a trigger function. Then you can use utitle
>> with txt2txtidx and tsearch.
>>
>> Another solution is to generate the txtidx field (i.e. titleidx) directly
>> using unac:
>>
>> UPDATE titles SET titleidx = txt2txtidx(unac(title));
>>
>> But the problem is that I've not succeeded using it with tsearch because
>> (of course) it doesn't allow functions as parameters. So my first idea
>was
>> to integrate unac in tsearch.
>
>what's exactly a problem ?
>UPDATE titles SET titleidx = txt2txtidx(unac(title));
>works fine. Perhaps, you have a problem with query ?
>
>>
>> Bye.
>>
>> >-- Messaggio Originale --
>> >Date: Wed, 18 Sep 2002 15:08:59 +0300 (GMT)
>> >From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
>> >To: nngodinh(at)tiscali(dot)it
>> >Cc: pgsql-hackers(at)postgresql(dot)org
>> >Subject: Re: [HACKERS] unaccent
>> >
>> >
>> >On Wed, 18 Sep 2002 nngodinh(at)tiscali(dot)it wrote:
>> >
>> >> Greetings,
>> >>
>> >> As far as I use the txtidx data structure in conjunction with gist
indexing
>> >> to make a word indexing of a very large UNICODE db, I've implemented
>> a
>> >PostgreSQL
>> >> function that uses libunac to unaccent TEXT fileds.
>> >>
>> >> The resulting text is in UTF-8, but you can modify it in the sources
>> with
>> >> an appropriate value (using iconv charset names).
>> >>
>> >> Get libunac from: http://www.nongnu.org/unac/ (it uses iconv)
>> >>
>> >> Extract the archive, compile it (make). Move pg_unac.so to your postgresql
>> >> shared libraries dir.
>> >>
>> >> Link it in postgresql:
>> >>
>> >> CREATE FUNCTION unac(TEXT) RETURNS TEXT AS 'path_to_pg_unac.so' LANGUAGE
>> >> C;
>> >>
>> >> What about integrating unaccent libraries directly in tsearch? It
is
>> useful
>> >> for french search engines (for instance).
>> >
>> >I think better to have separate module contrib/unac and document using
>> >it with tsearch. Please write us a couple of lines about using
>> >your function and we'll add them into tsearch documentation.
>> >
>> >btw, use palloc instead of malloc in postgresql functions .
>> >
>> >>
>> >> Bye.
>> >>
>> >> Nhan NGO DINH
>> >>
>> >>
>> >> __________________________________________________________________
>> >> Tiscali Ricaricasa
>> >> la prima prepagata per navigare in Internet a meno di un'urbana e
>> >> risparmiare su tutte le tue telefonate. Acquistala on line e non avrai
>> >> nessun costo di attivazione n? di ricarica!
>> >> http://ricaricasaonline.tiscali.it/
>> >>
>> >>
>> >>
>> >>
>> >
>> > Regards,
>> > Oleg
>> >_____________________________________________________________
>> >Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
>> >Sternberg Astronomical Institute, Moscow University (Russia)
>> >Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
>> >phone: +007(095)939-16-83, +007(095)939-23-83
>> >
>> >
>> >---------------------------(end of broadcast)---------------------------
>> >TIP 3: if posting/reading through Usenet, please send an appropriate
>> >subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
>> >message can get through to the mailing list cleanly
>>
>>
>>
>> __________________________________________________________________
>> Tiscali Ricaricasa
>> la prima prepagata per navigare in Internet a meno di un'urbana e
>> risparmiare su tutte le tue telefonate. Acquistala on line e non avrai
>> nessun costo di attivazione n? di ricarica!
>> http://ricaricasaonline.tiscali.it/
>>
>>
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 4: Don't 'kill -9' the postmaster
>>
>
> Regards,
> Oleg
>_____________________________________________________________
>Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
>Sternberg Astronomical Institute, Moscow University (Russia)
>Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
>phone: +007(095)939-16-83, +007(095)939-23-83
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 6: Have you searched our list archives?
>
>http://archives.postgresql.org

__________________________________________________________________
Tiscali Ricaricasa
la prima prepagata per navigare in Internet a meno di un'urbana e
risparmiare su tutte le tue telefonate. Acquistala on line e non avrai
nessun costo di attivazione né di ricarica!
http://ricaricasaonline.tiscali.it/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2002-09-18 14:47:27 Re: [GENERAL] Still big problems with pg_dump!
Previous Message Robert Treat 2002-09-18 14:08:19 Re: Backend crash (long)