Re: FTS Configuration option

From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FTS Configuration option
Date: 2016-10-12 12:08:25
Message-ID: CAE2gYzzdqjeCpPk-BU1AWkFsn1yZocBWpAYd1WLA-EFy13Ozgg@mail.gmail.com
Lists: pgsql-hackers

> => ALTER TEXT SEARCH CONFIGURATION multi_conf
> ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
> word, hword, hword_part
> WITH german_ispell (JOIN), english_ispell, simple;

I have had something like this in mind ever since I dealt with FTS for a
Turkish real estate listing application. Being able to pipe the output of
some dictionaries into the next one is a nice feature we have had since 9.0,
but it is not always sufficient. I think it is wrong to decide this on a
per-dictionary basis. Something slightly more complicated, which connects
dictionaries to each other in parallel or in series, might be more useful.

My problem was related to the special characters in Turkish (ç, ğ, ı,
ö, ü). It is very common to just type the similar-looking 7-bit characters
(c, g, i, o, u) instead. The unaccent extension changes them as
desired and passes the altered words to the subsequent dictionary
when the configuration is changed like this:

> ALTER TEXT SEARCH CONFIGURATION turkish
> ALTER MAPPING FOR word, hword, hword_part
> WITH unaccent, turkish_stem;

However, the stemmer then doesn't do a good job on those words, because
the changed characters are important for the language. What I really
needed was something like this:

> ALTER TEXT SEARCH CONFIGURATION turkish
> ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word, hword, hword_part
> WITH (fix_mistyped_characters AND (turkish_hunspell OR turkish_stem) AND unaccent);
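
The AND/OR mapping above is a proposal, not syntax that PostgreSQL accepts. The chained unaccent setup described earlier can be tried out with existing commands. What follows is only a minimal sketch: it assumes the contrib module unaccent is available on the server, the configuration name turkish_unaccent and the sample word are made up for illustration, and it works on a copy so the built-in turkish configuration stays untouched.

```sql
-- Assumption: the "unaccent" contrib module is installed on the server.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- "turkish_unaccent" is a hypothetical name; copying keeps the built-in
-- "turkish" configuration unchanged.
CREATE TEXT SEARCH CONFIGURATION turkish_unaccent (COPY = turkish);

-- unaccent is a filtering dictionary: it hands the altered token on to
-- the next dictionary in the list, here the Snowball stemmer.
ALTER TEXT SEARCH CONFIGURATION turkish_unaccent
    ALTER MAPPING FOR word, hword, hword_part
    WITH unaccent, turkish_stem;

-- Show, per token, which dictionaries were consulted, which one
-- recognized the token, and the lexemes it produced.
SELECT alias, token, dictionaries, dictionary, lexemes
FROM ts_debug('turkish_unaccent', 'gözlüklü');
```

Running the same ts_debug query against the unmodified turkish configuration gives a side-by-side view of how the stemmer behaves with and without the characters replaced first.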
