Re: What is the simpliest text search configuration?

From: Michael Nacos <m(dot)nacos(at)gmail(dot)com>
To: jerome(dot)eteve(at)gmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: What is the simpliest text search configuration?
Date: 2009-11-12 13:40:59
Message-ID: 407fa4640911120540x57565291r197d655f0f228bce@mail.gmail.com
Lists: pgsql-general

Dear Jerome,

From personal experience, full-text searching in PostgreSQL can be quite
powerful, but it's not simple: it requires thought, planning and coding.
PostgreSQL mainly provides an efficient token-matching mechanism that
supports positional information and weights, but its natural language
processing and normalization are pretty basic.

If you don't mind writing a couple of user-defined functions to take
control of lexeme normalization, then tsvector/tsquery support can be a
very powerful tool for custom search engines.
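
To make the normalization steps concrete, here is a rough sketch in Python
of the pipeline you describe: strip accents, split on non-word characters,
lower-case, and keep every token (no stopword list). It only illustrates
the logic you would port into a user-defined function; the name
normalize_tokens is made up for this example and is not a PostgreSQL API.

```python
import re
import unicodedata


def normalize_tokens(text):
    """Return lower-cased, accent-free word tokens; no stopwords are dropped.

    Hypothetical helper for illustration only, not part of PostgreSQL.
    """
    # Decompose accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # \w+ keeps runs of letters, digits and underscores; anything else
    # acts as a separator and is skipped.
    return [token.lower() for token in re.findall(r"\w+", stripped)]


if __name__ == "__main__":
    print(normalize_tokens("Jérôme Etévé and THE café"))
```

Letters without a Unicode decomposition (for example "ø") pass through
unchanged in this sketch, so a real function would probably need an
explicit mapping table for those.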

regards,

Michael

2009/11/12 Jérôme Etévé <jerome(dot)eteve(at)gmail(dot)com>

> Hi all,
>
> I'd like to implement a full text search with postgresql, and I can't find
> a text search configuration that would just:
>
> - map accented Unicode letters to their unaccented equivalents
> - tokenize the words (and skip any non-word characters)
> - apply no stopword filtering
> - lower-case the tokens
>
> How can I achieve this? I'm particularly interested in deactivating
> the stopwords filtering.
>
> I tried pg_catalog.simple, but despite its name, it still considers stop
> words.
>
> Thanks for your help!
>
> Jerome.
>
>
