Re: BUG #13440: unaccent does not remove all diacritics

From: Léonard Benedetti <benedetti(at)mlpo(dot)fr>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13440: unaccent does not remove all diacritics
Date: 2016-03-15 17:44:57
Lists: pgsql-bugs

15/03/2016 18:01, Teodor Sigaev wrote:
>> So I think we can keep just a version for Python 2 for now. If everyone
>> agrees, I'll update the files and patch.
> The attached patch is my attempt to make the script work with both
> Python 2 and 3. At least it produces the same result for 2.7 and
> 3.4. Please, could you check? I'm not a Python developer at all.
> BTW, I removed the Unicode characters from the code and left them only
> in comments.
Unfortunately, this script is not functional: the characters handled by
“parse_cldr_latin_ascii_transliterator” are absent from the output. It is
probably a compatibility problem with the regex (the two versions of the
language are not fully compatible, and it is not always possible to write
code that works with both).

Given the feedback so far, three points stand out: the PostgreSQL source
uses only Python 2; support for that version will not end soon; and, above
all, this script needs to be run very rarely (only when the Unicode
Standard or the transliterator is updated; it is not part of the build
process). The easiest approach therefore seems to be to keep a single
Python 2 script.

So, you will find attached a new patch: it is the same script, compatible
with Python 2, *with only ASCII characters*.
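
To illustrate the ASCII-only idea (this is a rough sketch, not the code
from the attached patch), a rule line of the form "X → Y ;" can be matched
while keeping the source file pure ASCII by writing the arrow as an escape.
The names RULE_PATTERN and parse_rule, and the exact rule format, are
assumptions made up for this example:

```python
import re

# Hypothetical pattern for lines such as "<char> -> <replacement> ;" where
# the arrow is U+2192. It is written as an escape in a u"" literal so that
# the source contains only ASCII characters.
RULE_PATTERN = re.compile(u"^(.) \u2192 (?:'(.+)'|(.+)) ;")


def parse_rule(line):
    """Return (source, replacement) for a rule line, or None if no match."""
    match = RULE_PATTERN.match(line)
    if match is None:
        return None
    source = match.group(1)
    # The replacement may be quoted (group 2) or bare (group 3).
    if match.group(2) is not None:
        replacement = match.group(2)
    else:
        replacement = match.group(3)
    return source, replacement
```

For instance, parse_rule(u"\u00e6 \u2192 ae ;") yields the pair
(u"\u00e6", u"ae"), and a line that is not a rule yields None.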


Léonard Benedetti

Attachment Content-Type Size
improve-unaccent-default-rules-generation-script-v5.patch text/x-patch 15.5 KB
