From: | "Jonathan Bond-Caron" <jbondc(at)openmv(dot)com> |
---|---|
To: | 'Fco(dot) Mario Barcala Rodríguez' <lbarcala(at)freeresearch(dot)org>, <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Full text index without accents |
Date: | 2008-07-24 14:53:46 |
Message-ID: | 005d01c8ed9d$138fd1e0$3aaf75a0$@com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
This would probably help:
CREATE OR REPLACE FUNCTION norm_text_latin(character varying)
RETURNS character varying AS
$BODY$
declare
p_str alias for $1;
v_str varchar;
begin
select translate(p_str, 'ÀÁÂÃÄÅ', 'AAAAAA') into v_str;
select translate(v_str, 'ÉÈËÊ', 'EEEE') into v_str;
select translate(v_str, 'ÌÍÎÏ', 'IIII') into v_str;
select translate(v_str, 'ÌÍÎÏ', 'IIII') into v_str;
select translate(v_str, 'ÒÓÔÕÖ', 'OOOOO') into v_str;
select translate(v_str, 'ÙÚÛÜ', 'UUUU') into v_str;
select translate(v_str, 'àáâãäå', 'aaaaaa') into v_str;
select translate(v_str, 'èéêë', 'eeee') into v_str;
select translate(v_str, 'ìíîï', 'iiii') into v_str;
select translate(v_str, 'òóôõö', 'ooooo') into v_str;
select translate(v_str, 'ùúûü', 'uuuu') into v_str;
select translate(v_str, 'Çç', 'Cc') into v_str;
return v_str;
end;$BODY$
LANGUAGE 'plpgsql' VOLATILE;
There's also o useful functions here:
http://www.project-open.org/doc/intranet-search-pg/intranet-search-pg-create.sql
-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Fco. Mario Barcala Rodríguez
Sent: July 24, 2008 4:47 AM
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Full text index without accents
Finally I create a function like:
CREATE OR REPLACE FUNCTION nonsensible (text) RETURNS text AS $$
DECLARE
var1 varchar;
BEGIN
var1=replace($1, 'á', 'a');
var1=replace(var1, 'é', 'e');
var1=replace(var1, 'í', 'i');
var1=replace(var1, 'ó', 'o');
var1=replace(var1, 'ú', 'u');
var1=replace(var1, 'Á', 'A');
var1=replace(var1, 'É', 'E');
var1=replace(var1, 'Í', 'I');
var1=replace(var1, 'Ó', 'O');
var1=replace(var1, 'Ú', 'U');
return var1;
END
$$LANGUAGE plpgsql immutable;
Then, create text indexes, one for sensible queries and other for
unsensible ones:
CREATE INDEX textindex ON document USING
gin(to_tsvector('spanish',text));
CREATE INDEX textindexn ON document USING
gin(to_tsvector('spanish',nonsensible(text)));
And then make a query sensible or unsensible to accents doing:
SELECT id FROM document WHERE to_tsvector('spanish',text) @@
to_tsquery('spanish','word_with_accent');
or:
SELECT id FROM document WHERE to_tsvector('spanish',nonsensible(text))
@@ to_tsquery('spanish',nonsensible('word_with_accent'));
respectively.
I think postgreSQL uses both indexes as necessary. I believe to remember
reading something about it in the documentation.
Thank you very much,
Mario Barcala
> Here is an example
>
> CREATE FUNCTION dropatsymbol(text) RETURNS text
> AS 'select replace($1, ''@'', '' '');'
> LANGUAGE SQL;
>
> arxiv=# select to_tsvector('english',dropatsymbol('oleg(at)sai(dot)msu(dot)su'));
> to_tsvector
> -------------------------
> 'oleg':1 'sai.msu.su':2
From | Date | Subject | |
---|---|---|---|
Next Message | Shane Ambler | 2008-07-24 14:57:15 | Re: mac install question |
Previous Message | Scott Marlowe | 2008-07-24 14:52:44 | Re: php + postgresql |