Re: BUG #18216: Unaccent function is unable to remove accents (diacritic signs) from Japanese character 'ド'

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: shailesh(dot)totale(at)sailpoint(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18216: Unaccent function is unable to remove accents (diacritic signs) from Japanese character 'ド'
Date: 2023-11-28 14:58:35
Message-ID: 4143551.1701183515@sss.pgh.pa.us
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> PostgreSQL's unaccent module does not use Unicode normalisation, but only a
> simple search-and-replace dictionary. The dictionary, unaccent.rules
> (https://github.com/postgres/postgres/blob/master/contrib/unaccent/unaccent.rules)
> , does not contain these Japanese characters, so it is unable to remove
> the diacritic signs. Can someone please advise when we can expect these
> Japanese characters to be added?

unaccent.rules, as distributed, is just an example. It is not meant
to be exhaustive or authoritative. Feel free to add your own entries
to your copy.
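
As an illustration of adding your own entries, here is a minimal sketch that
generates rules lines for Japanese kana, assuming Unicode canonical
decomposition (NFD) is an acceptable way to find the base character. Each
output line is an accented character, whitespace, then its replacement, which
is the format the rules file uses. The function names and the output file name
are hypothetical, not part of the unaccent module.

```python
import unicodedata

# Combining voiced (dakuten, U+3099) and semi-voiced (handakuten, U+309A)
# sound marks. Dropping these is this sketch's assumption of what
# "unaccenting" kana means.
KANA_MARKS = {"\u3099", "\u309a"}


def kana_rules(start=0x3041, end=0x30FF):
    """Yield (marked, base) pairs for hiragana and katakana code points
    whose NFD form is a base character plus a sound mark."""
    for cp in range(start, end + 1):
        ch = chr(cp)
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if c not in KANA_MARKS)
        # Skip the bare combining marks (empty base) and unmarked characters.
        if base and base != ch:
            yield ch, base


def format_rules(pairs):
    """Render pairs as rules lines: source, a tab, then the replacement."""
    return "".join(f"{src}\t{dst}\n" for src, dst in pairs)


def write_rules(path):
    """Write the generated lines to a UTF-8 file (path is hypothetical)."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(format_rules(kana_rules()))
```

Calling write_rules("japanese_kana.rules") produces lines that can be appended
to your copy of unaccent.rules in $SHAREDIR/tsearch_data/. If you keep them in
a separate rules file instead, point a dictionary at it, for example with
ALTER TEXT SEARCH DICTIONARY unaccent (RULES='japanese_kana'), bearing in mind
that this replaces the stock rules rather than extending them.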

regards, tom lane
