Re: Extra Vietnamese unaccent rules

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Dang Minh Huong <kakalot49(at)gmail(dot)com>, Kha Nguyen <nlhkha(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extra Vietnamese unaccent rules
Date: 2017-06-03 23:12:12
Message-ID: CAB7nPqQg4jioETBudh0VhpS6s3NWmC4OWqcTiCx_ZHBa8p19_A@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 29, 2017 at 10:47 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> [Quoting Michael]
>>> Actually, with the recent work that has been done with
>>> unicode_norm_table.h which has been to transpose UnicodeData.txt into
>>> user-friendly tables, shouldn't the python script of unaccent/ be
>>> replaced by something that works on this table? This does a canonical
>>> decomposition but just keeps the first characters with a class
>>> ordering of 0. So we have basic APIs able to look at UnicodeData.txt
>>> and let caller do decision making with the result returned.
>>
>> Thanks, I will learn about it.
>
> It seems like that could be useful for runtime use (I'm sure there is
> a whole world of Unicode support we could add), but here we're only
> trying to generate a mapping file to add to the source tree, so I'm
> not sure how it's relevant.

Yes, that's what I am getting at, but that would be really
dictionary-specific, and it would roughly amount to providing a
fast-path equivalent to the tsearch_readline* routines working on
files. The addition of new infrastructure may not be worth the
performance gains. For this fix, there is definitely no need to do
anything more complicated than tweaking the script to generate the
rules.
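
To illustrate the idea of "canonical decomposition, then keep only the
characters with a combining class of 0", here is a minimal sketch. It
is not the script in unaccent/ and it does not read
unicode_norm_table.h; it assumes Python's standard unicodedata module
as a stand-in for UnicodeData.txt, and the function names and the
tab-separated output are illustrative only.

```python
import unicodedata


def strip_marks(ch):
    """Canonically decompose ch and drop every combining mark.

    Characters whose canonical combining class is 0 are kept; anything
    else (accents, hooks, dots below, ...) is discarded.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if unicodedata.combining(c) == 0)


def unaccent_rules(codepoints):
    """Return (accented, plain) pairs for characters that actually change."""
    rules = []
    for cp in codepoints:
        ch = chr(cp)
        plain = strip_marks(ch)
        if plain != ch:
            rules.append((ch, plain))
    return rules


if __name__ == "__main__":
    # U+1EA0..U+1EF9 covers the Vietnamese letters in the
    # Latin Extended Additional block.
    for src, dst in unaccent_rules(range(0x1EA0, 0x1EFA)):
        print(f"{src}\t{dst}")
```

Note that a letter such as U+0111 (d with stroke) has no canonical
decomposition, so a sketch like this leaves it untouched; such cases
would still need explicit handling in whatever generates the rules.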
--
Michael
