Re: Full text index without accents

From: "Fco(dot) Mario Barcala" Rodríguez <lbarcala(at)freeresearch(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Full text index without accents
Date: 2008-07-24 08:46:46
Message-ID: 1216889206.5112.11.camel@tambre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Finally I create a function like:

CREATE OR REPLACE FUNCTION nonsensible (text) RETURNS text AS $$
DECLARE
var1 varchar;
BEGIN
var1=replace($1, 'á', 'a');
var1=replace(var1, 'é', 'e');
var1=replace(var1, 'í', 'i');
var1=replace(var1, 'ó', 'o');
var1=replace(var1, 'ú', 'u');
var1=replace(var1, 'Á', 'A');
var1=replace(var1, 'É', 'E');
var1=replace(var1, 'Í', 'I');
var1=replace(var1, 'Ó', 'O');
var1=replace(var1, 'Ú', 'U');
return var1;
END
$$LANGUAGE plpgsql immutable;

Then, create text indexes, one for sensible queries and other for
unsensible ones:

CREATE INDEX textindex ON document USING
gin(to_tsvector('spanish',text));

CREATE INDEX textindexn ON document USING
gin(to_tsvector('spanish',nonsensible(text)));

And then make a query sensible or unsensible to accents doing:

SELECT id FROM document WHERE to_tsvector('spanish',text) @@
to_tsquery('spanish','word_with_accent');

or:

SELECT id FROM document WHERE to_tsvector('spanish',nonsensible(text))
@@ to_tsquery('spanish',nonsensible('word_with_accent'));
respectively.

I think postgreSQL uses both indexes as necessary. I believe to remember
reading something about it in the documentation.

Thank you very much,

Mario Barcala

> Here is an example
>
> CREATE FUNCTION dropatsymbol(text) RETURNS text
> AS 'select replace($1, ''@'', '' '');'
> LANGUAGE SQL;
>
> arxiv=# select to_tsvector('english',dropatsymbol('oleg(at)sai(dot)msu(dot)su'));
> to_tsvector
> -------------------------
> 'oleg':1 'sai.msu.su':2

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message admin 2008-07-24 09:41:36 php + postgresql
Previous Message Karsten Hilbert 2008-07-24 07:19:36 Re: pg_query transaction: auto rollback? begin or start?? commit or end???