| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | Hugh Ranalli <hugh(at)whtc(dot)ca> | 
| Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, pgsql-bugs(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: BUG #15548: Unaccent does not remove combining diacritical characters | 
| Date: | 2018-12-14 22:50:03 | 
| Message-ID: | 16726.1544827803@sss.pgh.pa.us | 
| Lists: | pgsql-bugs pgsql-hackers | 
Hugh Ranalli <hugh(at)whtc(dot)ca> writes:
> I've attached a patch that removes combining diacriticals. As with Latin and
> Greek letters, it uses ranges to restrict its activity.

Cool.  Please add it to the current CF so we don't forget about it:

https://commitfest.postgresql.org/21/

> I have not submitted a patch for unaccent.rules, as it seems that a rules
> file generated from generate_unaccent_rules.py will actually remove a large
> number of rules (even before my changes), such as replacing the copyright
> symbol © with (C), as well as other accented characters. It's probably
> worth asking if the shipped unaccent.rules should correspond to what the
> shipped generation utility produces, or not. I was surprised to see that it
> didn't.

Me too -- seems like that bears looking into.  Perhaps the script's
results are platform dependent -- what were you testing on?

regards, tom lane
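
For readers following the thread, the range-restricted removal of combining diacriticals described in the quoted text can be sketched roughly as below. This is only an illustration, not the attached patch: the names `COMBINING_RANGES`, `is_combining_mark`, and `strip_combining_marks` are hypothetical, and the single range used (the Unicode "Combining Diacritical Marks" block, U+0300 to U+036F) is an assumption about which code points such a patch would cover.

```python
import unicodedata

# Assumption: only the Unicode "Combining Diacritical Marks" block is covered.
# The actual patch may use different or additional ranges.
COMBINING_RANGES = [(0x0300, 0x036F)]


def is_combining_mark(ch):
    """Return True if the character's code point falls in a listed range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in COMBINING_RANGES)


def strip_combining_marks(text):
    """Drop every character that lies in one of the combining-mark ranges."""
    return "".join(ch for ch in text if not is_combining_mark(ch))


# NFD splits a precomposed "é" into "e" followed by U+0301 (combining acute),
# which is the decomposed form that range-based removal targets.
decomposed = unicodedata.normalize("NFD", "café")
print(strip_combining_marks(decomposed))  # prints "cafe"
```

Restricting the check to explicit ranges, rather than removing every character of a given Unicode category, keeps the effect narrow in the same spirit as the range limits the quoted text mentions for Latin and Greek letters.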