Re: BUG #15548: Unaccent does not remove combining diacritical characters

From: raam narayana <raam(dot)soft(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Hugh Ranalli <hugh(at)whtc(dot)ca>
Subject: Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date: 2019-02-10 20:06:25
Message-ID: 154982918542.11785.1374991294537224097.pgcf@coridan.postgresql.org
Lists: pgsql-bugs pgsql-hackers

Hi,

After the latest commit to the master branch, I tried to test the Python script. I still see that the output from the script is completely different from the contents of the unaccent.rules file. Am I missing anything? My testing consisted of the following steps.

I downloaded the following files:

http://unicode.org/Public/8.0.0/ucd/UnicodeData.txt

http://unicode.org/cldr/trac/export/14746/tags/release-34/common/transforms/Latin-ASCII.xml

Then I ran the Python script as follows:

python generate_unaccent_rules.py --unicode-data-file UnicodeData.txt --latin-ascii-file Latin-ASCII.xml > unaccent.rules

I am using Python 3.7.1 on Windows 10.
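
To make "completely different" concrete, a comparison along the following lines might help. This is only a minimal sketch, not part of the patch: the helper names load_rules and diff_rules are hypothetical, and it assumes both files are meant to be UTF-8. It reads each file with an explicit encoding and normalizes line endings, so differences caused only by CRLF versus LF are not reported.

```python
import difflib
import sys


def load_rules(path):
    """Read a rules file as UTF-8 and return its lines without line endings."""
    # Text mode translates CRLF to LF on read; errors="replace" keeps a
    # mis-encoded file from raising, so it shows up as differing lines instead.
    with open(path, encoding="utf-8", errors="replace") as f:
        return [line.rstrip("\n") for line in f]


def diff_rules(expected_path, generated_path):
    """Return unified-diff lines between two rules files (empty list if equal)."""
    expected = load_rules(expected_path)
    generated = load_rules(generated_path)
    return list(
        difflib.unified_diff(
            expected,
            generated,
            fromfile=str(expected_path),
            tofile=str(generated_path),
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Usage (hypothetical script name):
    #   python compare_rules.py path/to/unaccent.rules generated.rules
    if len(sys.argv) == 3:
        for line in diff_rules(sys.argv[1], sys.argv[2]):
            print(line)
```

An empty result would mean the two files hold the same rules once line endings are ignored; any remaining lines show exactly which rules differ.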

The new status of this patch is: Needs review
