BUG #14278: Problem searching spanish words with accent mark outside the stem

From: paco(at)hernandezgomez(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14278: Problem searching spanish words with accent mark outside the stem
Date: 2016-08-04 10:25:24
Message-ID: 20160804102524.1430.90715@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14278
Logged by: Paco Hernández
Email address: paco(at)hernandezgomez(dot)com
PostgreSQL version: 9.6beta3
Operating system: Linux
Description:

Dear sirs:

Searching without the accent mark does not work correctly when the accent
mark falls outside the stem of the word.

For example, this matches correctly:

postgres=# select to_tsvector('spanish', 'canción') @@ to_tsquery('spanish', 'cancion');
 ?column?
----------
 t
(1 row)

This works and returns true because the stem of "canción" is "cancion", so
when we search for "cancion" (without accent mark), it matches correctly.

But when the accent mark is outside the stem, as in "peluquería", the match
fails: the stem of "peluquería" is "peluqu", whereas
to_tsquery('spanish', 'peluqueria') produces "peluqueri".

postgres=# select to_tsvector('spanish', 'peluquería') @@ to_tsquery('spanish', 'peluqueria');
 ?column?
----------
 f
(1 row)
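
The mismatch can be checked directly by comparing the lexemes each side
produces. This is a minimal sketch that uses only the built-in to_tsvector
and to_tsquery functions with the 'spanish' configuration; the column
aliases are illustrative. Going by the stems described above, the first
column should contain 'peluqu' and the second 'peluqueri':

```sql
-- Compare the lexeme stored for the accented word with the lexeme
-- produced for the unaccented query term.
select to_tsvector('spanish', 'peluquería') as document_lexemes,
       to_tsquery('spanish', 'peluqueria')  as query_lexemes;
```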

This matters because many people omit the accent mark on the letter "i" in
"peluquería" and similar words.

Thank you very much.

Best regards,
Paco Hernández.
