From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | raam narayana <raam(dot)soft(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Hugh Ranalli <hugh(at)whtc(dot)ca> |
Subject: | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date: | 2019-02-10 20:44:01 |
Message-ID: | CAEepm=3GtcMM3+_DEAmM5X=xtDwVo7C9mPTY04vkLCmQoT6zCw@mail.gmail.com |
Lists: | pgsql-bugs pgsql-hackers |
On Mon, Feb 11, 2019 at 7:07 AM raam narayana <raam(dot)soft(at)gmail(dot)com> wrote:
> After the latest commit in the master branch, I tried to test the Python script. Ironically, I still see that the output from the script is completely different from the content of the unaccent.rules file. Am I missing anything? My testing includes the following:
>
> Downloaded the following files
>
> http://unicode.org/Public/8.0.0/ucd/UnicodeData.txt
>
> http://unicode.org/cldr/trac/export/14746/tags/release-34/common/transforms/Latin-ASCII.xml
>
> Executed the below python script
>
> python generate_unaccent_rules.py --unicode-data-file UnicodeData.txt --latin-ascii-file Latin-ASCII.xml > unaccent.rules
>
> I am using Python 3.7.1 on Windows 10.
>
> The new status of this patch is: Needs review
Hi Raam,
How does it differ? Can you please share the output you get? I used
Python 2.7 on a Mac, exactly those input files, and my output matched
Hugh's.
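
One way to show the difference is a unified diff of the two files. The following is only a minimal sketch, not part of the patch: the helper names read_rules and diff_rules are made up for illustration, and it assumes both files are UTF-8 encoded.

```python
import difflib
import io


def read_rules(path):
    # Hypothetical helper: read a rules file as UTF-8 (an assumption)
    # and return its lines without trailing newlines.
    with io.open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def diff_rules(generated_lines, reference_lines):
    # Return unified-diff lines; "-" marks lines found only in the
    # reference file, "+" marks lines found only in the generated one.
    return list(difflib.unified_diff(
        reference_lines, generated_lines,
        fromfile="reference", tofile="generated",
        lineterm=""))
```

Calling diff_rules(read_rules("unaccent.rules"), read_rules("contrib/unaccent/unaccent.rules")) with the freshly generated file first and the checked-in file second (both paths are placeholders) returns an empty list when the files match line for line.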
--
Thomas Munro
http://www.enterprisedb.com