Re: BUG #15548: Unaccent does not remove combining diacritical characters

From: Hugh Ranalli <hugh(at)whtc(dot)ca>
To: raam narayana <raam(dot)soft(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date: 2019-02-11 19:20:42
Message-ID: CAAhbUMODj1cCHjCpZ-=kxJxnVWyTsqu6ZnWe8+gCsb5SGnv=zA@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Sun, 10 Feb 2019 at 15:07, raam narayana <raam(dot)soft(at)gmail(dot)com> wrote:

> Hi,
>
> After the latest commit in the master branch, I tried to test the Python
> script. Ironically, I still see that the output from the script is
> completely different from the content of the unaccent.rules file. Am I
> missing anything? My testing includes the following:
>
> Downloaded the following files
>
> http://unicode.org/Public/8.0.0/ucd/UnicodeData.txt
>
> http://unicode.org/cldr/trac/export/14746/tags/release-34/common/transforms/Latin-ASCII.xml
>
> Ran the Python script as follows:
>
> python generate_unaccent_rules.py --unicode-data-file UnicodeData.txt
> --latin-ascii-file Latin-ASCII.xml > unaccent.rules
>
> I am using Python 3.7.1 on Windows 10.
>
> The new status of this patch is: Needs review
>

Hi Raam,
I just ran generate_unaccent_rules.py under two environments, using the
data files given above:
- Python 3.4.3 on Linux Mint 17.3 (equivalent to Ubuntu 14.04)
- Python 3.6.7 on Ubuntu 18.04

In both cases, the output was identical to that generated by the program
under Python 2.7. So yes, more information would help. Unfortunately I
don't have a Windows Python environment readily available, but could set
one up if I had to.
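
To narrow down where the two outputs diverge, something like the sketch
below might help. It is only an illustration: the helper names and the
command-line usage are hypothetical, and the idea that line endings or
output encoding could be involved on Windows is an assumption, not
something established in this thread. It compares the generated file with
the checked-in unaccent.rules while ignoring CRLF/LF differences, so only
real content differences are reported.

```python
import difflib
import sys


def normalize(data):
    """Decode bytes as UTF-8 and unify line endings to LF.

    Undecodable bytes are replaced rather than raising, so a file written
    in another encoding shows up as differing lines instead of a crash.
    """
    text = data.decode("utf-8", errors="replace")
    return text.replace("\r\n", "\n").replace("\r", "\n").split("\n")


def diff_rules(expected, actual):
    """Return unified-diff lines between two rule files given as bytes."""
    return list(
        difflib.unified_diff(
            normalize(expected),
            normalize(actual),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_rules.py unaccent.rules generated.rules
    with open(sys.argv[1], "rb") as f:
        expected_bytes = f.read()
    with open(sys.argv[2], "rb") as f:
        actual_bytes = f.read()
    for line in diff_rules(expected_bytes, actual_bytes):
        print(line)
```

An empty result would suggest the files differ only in line endings; a long
diff full of replacement characters would point toward an encoding problem
in how the output was written.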

Thanks,
Hugh
