Re: updating unaccent.rules for Arabic letters

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: kerbrose khaled <kerbrose(at)hotmail(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: updating unaccent.rules for Arabic letters
Date: 2019-11-03 16:12:15
Message-ID: 5527.1572797535@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-translators

kerbrose khaled <kerbrose(at)hotmail(dot)com> writes:
> I would like to update the unaccent.rules file to support Arabic letters. Could someone help me, or tell me how I can contribute this change? I have attached the file including the modifications, which are only the last 4 lines.

Hi! I've got no objection to including Arabic in the set of covered
languages, but handing us a new unaccent.rules file isn't the way to
do it, because that's a generated file. The adjacent script
generate_unaccent_rules.py generates it from the official Unicode
source data (see comments in that script). What we need, ultimately,
is a patch to that script so it will emit these additional translations.
Past commits that might be useful sources of inspiration include

https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=456e3718e7b72efe4d2639437fcbca2e4ad83099
https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=5e8d670c313531c0dca245943fb84c94a477ddc4
https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=ec0a69e49bf41a37b5c2d6f6be66d8abae00ee05

If you're not good with Python, maybe you could just explain to us
how to recognize these characters from Unicode character properties.
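
As a starting point for that kind of explanation, here is a minimal sketch of how one could inspect the relevant properties with Python's standard unicodedata module. It assumes the characters of interest are precomposed letters whose canonical decomposition is a base letter followed by combining marks; whether that covers the letters in the proposed rules is not established here. The helper name strip_marks is hypothetical, and this does not reproduce the logic of generate_unaccent_rules.py.

```python
import unicodedata


def strip_marks(ch):
    """Return ch's base letter if ch canonically decomposes into a base
    character followed only by combining marks; otherwise return None.

    Hypothetical helper for illustration only, not the script's own logic.
    """
    decomp = unicodedata.decomposition(ch)
    # No decomposition, or a compatibility decomposition ("<tag> ...").
    if not decomp or decomp.startswith("<"):
        return None
    parts = [chr(int(cp, 16)) for cp in decomp.split()]
    base, marks = parts[0], parts[1:]
    # General categories Mn, Mc and Me all start with "M" (marks).
    if marks and all(unicodedata.category(m).startswith("M") for m in marks):
        return base
    return None


if __name__ == "__main__":
    # A few precomposed Arabic letters (alef, waw and yeh with madda/hamza).
    for ch in "\u0622\u0623\u0624\u0625\u0626":
        base = strip_marks(ch)
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> "
              f"U+{ord(base):04X} {unicodedata.name(base)}")
```

Letters that carry no canonical decomposition come back as None from this helper, so they would need a different rule or an explicit list.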

regards, tom lane
