Re: BUG #15548: Unaccent does not remove combining diacritical characters

From: Hugh Ranalli <hugh(at)whtc(dot)ca>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date: 2018-12-13 18:50:37
Message-ID: CAAhbUMOHkoN3Jeti4dp1jz3VY=XZPcCqpX=sW=mgmJbdMS--ng@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Thu, 13 Dec 2018, 11:26 Daniel Verite <daniel(at)manitou-mail(dot)org> wrote:

> Tom Lane wrote:
>
> > Hm, I thought the OP's proposal was just to make unaccent drop
> > combining diacriticals independently of context, which'd avoid the
> > combinatorial-growth problem.
>

That's what I was thinking. Given that the accent is separate from the
characters, simply dropping it should result in the correct unaccented
character.
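
To illustrate the idea outside of PostgreSQL, here is a minimal sketch using
Python's standard unicodedata module (not unaccent itself). The helper name
drop_combining is hypothetical. When text is in decomposed form the accent is
its own code point, so deleting every combining mark leaves the base letter,
whatever that letter is.

```python
import unicodedata

def drop_combining(text):
    """Delete every code point with a nonzero canonical combining class."""
    return "".join(ch for ch in text if unicodedata.combining(ch) == 0)

# "e" followed by U+0301 COMBINING ACUTE ACCENT: the decomposed form of "é".
decomposed = "e\u0301cole"
print(drop_combining(decomposed))  # prints "ecole"
```

A precomposed character such as U+00E9 passes through this helper unchanged,
so it would still need the kind of per-character rule the file already has.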

>
> In that case, this could be achieved by simply appending the
> diacriticals themselves to unaccent.rules, since replacement of a
> string by an empty string is already supported as a rule.
> It doesn't seem like the current file has any of these, but from
> https://www.postgresql.org/docs/11/unaccent.html :
>
> "Alternatively, if only one character is given on a line, instances
> of that character are deleted; this is useful in languages where
> accents are represented by separate characters"
>

Yes, I had read that in the docs, and that's the approach I planned to
take. I'll go ahead and develop a patch, then.
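
As a rough sketch of what such rules could look like (an assumption on my
part, not the eventual patch): the snippet below builds one single-character
line per code point in the Unicode Combining Diacritical Marks block
(U+0300 to U+036F), which is the line form the docs describe for deletion.
The function name combining_mark_rules and the choice of that one block are
both hypothetical.

```python
def combining_mark_rules(first=0x0300, last=0x036F):
    """Return one single-character rule line per code point in the range.

    Assumption: only the Combining Diacritical Marks block is covered.
    """
    return [chr(cp) for cp in range(first, last + 1)]

rules = combining_mark_rules()

# Text that could be appended to a rules file: one character per line.
rules_text = "\n".join(rules) + "\n"

# Print code points rather than the raw marks, which render poorly alone.
print(len(rules))                              # prints 112
print(", ".join(f"U+{ord(r):04X}" for r in rules[:3]))
```

Other blocks of combining marks exist, so the range would likely need to be
widened or driven from the Unicode data files rather than hard-coded.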

Best wishes,
Hugh

