Re: TSEARCH2: disable stemming in indexes and triggers

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Erwin Moller <erwin(at)darwine(dot)nl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: TSEARCH2: disable stemming in indexes and triggers
Date: 2007-05-31 18:23:21
Message-ID: Pine.LNX.4.64.0705312217170.12152@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 31 May 2007, Erwin Moller wrote:

> Hi all,
>
> I installed TSEARCH2 on Postgres8.1 (Debian).
> It runs all fine.
> I have 2 tables indexed, and created triggers to keep the vectorcolumns
> up
> to date.
>
>
> However, the text I indexed is a mix of Dutch and English and German.
> The default stemmingprocess is an annoyance for me.
> I would like to disable it.
> I found out that using 'simple' instead of 'default' when using
> to_tsvector() does excactly that, but I don't know how to change my
> triggers and indexes to keep doing the same (using 'simple').
>
>
> My TSEARCH2 and trigger-skills are developing, but I lack the confidence
> to
> change thing on the site (which is live), so I thought I check in here.
>
>
> I have a table named tblvacature, that contains a bunch of columns that
> need
> to be indexed by TSEARCH.
>
>
> This is what I did (and it works, except for the fact I want to disable
> stemming)
>
>
> 1) ALTER TABLE tblvacature ADD COLUMN idxFTI tsvector;
> 2) UPDATE tblvacature SET idxFTI=to_tsvector('simple',
> coalesce(title,'') ||' '||
> coalesce(shortintro,'') ||' '||
> coalesce(werkgever,'') ||' '||
> coalesce(vacaturesteller,'') ||' '||
> coalesce(standplaats,'') ||' '||
> coalesce(divafdelingwerkgever,'') );
>
>
> 3) VACUUM FULL ANALYZE;
>
>
> 4) CREATE INDEX idxFTIvacture_idx ON tblvacature USING gist(idxFTI);
>
>
> 5) VACUUM FULL ANALYZE;
>
>
> 6) CREATE TRIGGER tsvectorupdate_vacature BEFORE UPDATE OR INSERT ON
> tblvacature FOR EACH ROW EXECUTE
> PROCEDURE tsearch2(idxFTI, title, shortintro, werkgever,
> vacaturesteller,
> standplaats, divafdelingwerkgever);
>
>
> Step 2 worked fine: no stemming.
> But how do I cange my step6 (trigger) so it keeps using 'simple'?
> Or do I have to change the index itself (step4) too? Or both?
>
>
> Please advise.

tsearch trigger is just an example, you could write your own function !
for example,

create function my_update() returns trigger as
$$
BEGIN
NEW.idxFTI=to_tsvector('simple',
coalesce(title,'') ||' '||
coalesce(shortintro,'') ||' '||
coalesce(werkgever,'') ||' '||
coalesce(vacaturesteller,'') ||' '||
coalesce(standplaats,'') ||' '||
coalesce(divafdelingwerkgever,'') );
RETURN NEW;
END;
$$
language plpgsql;

btw, why do you don't use setweight() function to assign different weights to
the different attributes ?

>
>
> Thanks for your time.
>
>
> Regards,
> Erwin Moller
>
> -----------------------
> PS: This is my first posting to pgsql-general: if I screw something up,
> please let me know.
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Teodor Sigaev 2007-05-31 18:27:06 Re: TSEARCH2: disable stemming in indexes and triggers
Previous Message Joshua D. Drake 2007-05-31 17:46:34 Re: Design Table & Search Question