Re: Flexible configuration for full-text search

From: Aleksandr Parfenov <a(dot)parfenov(at)postgrespro(dot)ru>
To: Emre Hasegeli <emre(at)hasegeli(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Subject: Re: Flexible configuration for full-text search
Date: 2017-11-07 09:48:38
Message-ID: 20171107124838.078e96cc@asp437-24-g082ur
Lists: pgsql-hackers

On Tue, 31 Oct 2017 09:47:57 +0100
Emre Hasegeli <emre(at)hasegeli(dot)com> wrote:

> > If we want to save this behavior, we should somehow pass a stopword
> > to tsvector composition function (parsetext in ts_parse.c) for
> > counter increment or increment it in another way. Currently, an
> > empty lexemes array is passed as a result of LexizeExec.
> >
> > One of possible way to do so is something like:
> > CASE polish_stopword
> > WHEN MATCH THEN KEEP -- stopword counting
> > ELSE polish_isspell
> > END
>
> This would mean keeping the stopwords. What we want is
>
> CASE polish_stopword -- stopword counting
> WHEN NO MATCH THEN polish_isspell
> END
>
> Do you think it is possible?

Hi Emre,

I have thought about how this could be implemented. The way I see it,
we would increment the word counter whenever any checked dictionary
matches the word, even if it returns no lexeme. The main drawback is
that the counter increment is implicit.
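
To illustrate the idea, here is a minimal sketch in Python. It is a toy
model, not PostgreSQL code: the names stopword_dict, simple_dict and
parse_text are hypothetical, and the convention that a dictionary
returns None for "no match" and an empty list for "matched, but no
lexeme" is an assumption made for the example.

```python
# Toy model of the proposal (hypothetical names, not PostgreSQL code).
# A dictionary is a function that returns:
#   None       -> the word was not recognized (no match)
#   []         -> the word matched but yields no lexeme (stop word)
#   [lexemes]  -> the word matched and was normalized


def stopword_dict(stopwords):
    """Build a dictionary that matches only the given stop words."""
    def lexize(word):
        return [] if word in stopwords else None
    return lexize


def simple_dict(word):
    """Match every word and return it lowercased."""
    return [word.lower()]


def parse_text(words, dictionaries):
    """Return (lexeme, position) pairs for the given words."""
    result = []
    position = 0
    for word in words:
        matched = False
        lexemes = []
        for lexize in dictionaries:
            out = lexize(word)
            if out is not None:
                matched = True
                lexemes = out
                break
        if matched:
            # Implicit increment: the counter advances even when the
            # matching dictionary returned no lexeme.
            position += 1
        for lexeme in lexemes:
            result.append((lexeme, position))
    return result


pairs = parse_text(["the", "cat", "sat"],
                   [stopword_dict({"the"}), simple_dict])
print(pairs)
```

In this sketch the stop word "the" produces no lexeme but still consumes
position 1, so "cat" and "sat" land at positions 2 and 3. Nothing in the
configuration says that the counter moves, which is the implicit part.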

--
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
