Re: BUG #13440: unaccent does not remove all diacritics

From: Michael Gradek <mike(at)busbud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13440: unaccent does not remove all diacritics
Date: 2015-06-15 04:02:28
Message-ID: CAEP8ZNWBH7Lc8KeaAxbPFJAC-RUUCKfBv5xcfOjqKvf4309Esw@mail.gmail.com
Lists: pgsql-bugs

Hi Tom,

Thanks for looking into this issue. Would this help?

> psql -l

                                  List of databases
     Name      |     Owner     | Encoding |   Collate   |    Ctype    | Access privileges
---------------+---------------+----------+-------------+-------------+-------------------
 grand-central | michaelgradek | UTF8     | en_US.UTF-8 | en_US.UTF-8 |

Here's one case where the transformation fails and another where it succeeds:

> psql grand-central

psql (9.4.1, server 9.3.5)
Type "help" for help.

grand-central=# select 'ț' as input, unaccent('ț') as observed, 't' as expected;
 input | observed | expected
-------+----------+----------
 ț     | ț        | t
(1 row)

grand-central=# select 'é' as input, unaccent('é') as observed, 'e' as expected;
 input | observed | expected
-------+----------+----------
 é     | e        | e
(1 row)

On Sun, Jun 14, 2015 at 1:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> mike(at)busbud(dot)com writes:
> > Sorry, I couldn't install the most recent minor release, but I did try this
> > on several different versions. I used Heroku to try a 9.4.3 build, and got
> > the same results
>
> > select 'ț' as input, unaccent('ț') as observed, 't' as expected;
> >  input | observed | expected
> > -------+----------+----------
> >  ț     | ț        | t
> > (1 row)
>
> Hm, I do see
>
> ţ t
>
> in unaccent.rules, so the transformation ought to happen. I suspect
> an encoding issue, eg your terminal window is not transmitting characters
> in the encoding Postgres thinks you're using. You did not provide any
> info about server encoding, client encoding, or client LC_xxx environment,
> so it's hard to debug from here.
>
> regards, tom lane
>
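
In case it helps narrow this down, here is a minimal Python sketch that prints the code point, Unicode name, and UTF-8 bytes of the character passed to unaccent() in the query and of the character quoted from unaccent.rules. It assumes the two characters are exactly the code points that appear in this copy of the thread (U+021B and U+0163); a mail client or copy/paste step may have altered them, so treat the result as a hint rather than a diagnosis. The helper name describe is only for illustration.

```python
import unicodedata

# Assumption: these are the code points as they appear in this copy of the
# thread. Copy/paste or a mail client may have changed the original bytes.
query_char = "\u021b"  # character passed to unaccent() in the query
rule_char = "\u0163"   # character quoted from unaccent.rules


def describe(ch: str) -> str:
    """Return the code point, Unicode name, and UTF-8 bytes of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} utf8={ch.encode('utf-8').hex()}"


for label, ch in (("query", query_char), ("rule", rule_char)):
    print(f"{label}: {describe(ch)}")

# If this prints False, the two look-alike characters are distinct code points.
print("same code point:", query_char == rule_char)
```

If the two lines differ, a rule written for one character would not be expected to match the other, which might explain the behaviour independently of any client or server encoding mismatch.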

--
Cheers,
Mike
--
Mike Gradek
Co-founder and CTO, Busbud
Busbud.com <http://busbud.com/> | mike(at)busbud(dot)com
*We're hiring!: Jobs at Busbud <http://www.busbud.com/en/about/jobs>*
