From: | Hugh Ranalli <hugh(at)whtc(dot)ca> |
---|---|
To: | thomas(dot)munro(at)enterprisedb(dot)com |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Verite <daniel(at)manitou-mail(dot)org>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #15548: Unaccent does not remove combining diacritical characters |
Date: | 2018-12-17 20:22:37 |
Message-ID: | CAAhbUMOX4QLj6c0O3GnjZYtR2dpAowss832Bq1n7oJyByeR7kQ@mail.gmail.com |
Lists: | pgsql-bugs pgsql-hackers |
On Sat, 15 Dec 2018 at 21:26, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:
> +1 for updating to the latest file from time to time. After
> http://unicode.org/cldr/trac/ticket/11383 makes it into a new release,
> our special_cases() function will have just the two Cyrillic
> characters, which should almost certainly be handled by adding
> Cyrillic to the ranges we handle via the usual code path, and DEGREE
> CELSIUS and DEGREE FAHRENHEIT. Those degree signs could possibly be
> extracted from Unicode.txt (or we could just forget about them), and
> then we could drop special_cases().
>
Well, when I modified the code to handle the new version of the
transliteration file, I discovered that was sufficient to handle the old
version as well. That's not the way things usually go, but I'll take it. ;-)

I've attached two patches, one to update generate_unaccent_rules.py, and
another that updates unaccent.rules from the v34 transliteration file. I'll
be happy to add these to the CF. Does anyone need to review them and give
me approval before I do so?
Best wishes,
Hugh