Re: Extra Vietnamese unaccent rules

From: Dang Minh Huong <kakalot49(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Kha Nguyen <nlhkha(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extra Vietnamese unaccent rules
Date: 2017-05-29 15:22:28
Message-ID: 69AE3AFD-BA0B-4A41-B32C-BA62CF7C70DB@gmail.com
Lists: pgsql-hackers


> On May 29, 2017, at 10:47, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>
> On Sun, May 28, 2017 at 7:55 PM, Dang Minh Huong <kakalot49(at)gmail(dot)com> wrote:
>> Thanks for the report and for the explanation about Unicode.
>> I attached a patch following Thomas's instructions. Could you confirm it?
>
> - is_plain_letter(table[codepoint.combining_ids[0]]) and \
> + (is_plain_letter(table[codepoint.combining_ids[0]]) or\
> + len(table[codepoint.combining_ids[0]].combining_ids) > 1) and \
>
> Shouldn't you use "or is_letter_with_marks()", instead of
> "or len(...) > 1"? Your test might catch something that isn't based
> on a 'letter' (according to is_plain_letter). Otherwise this looks
> pretty good to me. Please add it to the next commitfest.

Thanks for confirming.
I will add it to the next CF soon.
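
To make the difference between the two tests concrete, here is a minimal sketch of the recursive check Thomas suggests. It is not the code from the patch: it assumes Python 3 and uses the standard unicodedata module in place of the script's parsed table, and decomposition_ids() and plain_base() are hypothetical helpers added only for illustration. The names is_plain_letter and is_letter_with_marks follow the ones mentioned above, but their bodies here are my own guess at the intent.

```python
import unicodedata


def decomposition_ids(ch):
    """Canonical decomposition of ch as a list of code points ([] if none).

    Hypothetical helper: stands in for the script's combining_ids field.
    Compatibility decompositions (tagged like "<compat>") are ignored.
    """
    decomp = unicodedata.decomposition(ch)
    if not decomp or decomp.startswith("<"):
        return []
    return [int(part, 16) for part in decomp.split()]


def is_plain_letter(ch):
    """True for a plain ASCII letter."""
    return ("a" <= ch <= "z") or ("A" <= ch <= "Z")


def is_mark(ch):
    """True for combining marks (general categories Mn, Me, Mc)."""
    return unicodedata.category(ch) in ("Mn", "Me", "Mc")


def is_letter_with_marks(ch):
    """True if ch is a plain letter plus one or more marks, possibly nested."""
    ids = decomposition_ids(ch)
    if len(ids) < 2:
        return False
    if not all(is_mark(chr(i)) for i in ids[1:]):
        return False
    base = chr(ids[0])
    # Recursing here is what "len(...) > 1" cannot express: the base must
    # itself bottom out in a plain letter, not just have a decomposition.
    return is_plain_letter(base) or is_letter_with_marks(base)


def plain_base(ch):
    """Follow the first decomposition element down to the plain letter.

    Hypothetical helper; only call it when is_letter_with_marks(ch) is true.
    """
    while not is_plain_letter(ch):
        ch = chr(decomposition_ids(ch)[0])
    return ch
```

With this sketch, U+1EBF (e with circumflex and acute) is accepted because its base U+00EA is itself a letter with marks, and plain_base() reduces it to "e". U+0111 (d with stroke) has no decomposition at all, so neither test would pick it up.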

> I expect that some users in Vietnam will consider this to be a bugfix,
> which raises the question of whether to backpatch it. Perhaps we
> could consider fixing it for 10. Then users of older versions could
> grab the rules file from 10 to use with 9.whatever if they want to do
> that and reindex their data as appropriate.

I am also inclined to fix it for 10, because that will not affect current users.
But Kha Nguyen, do you want it back-patched to all supported versions?
# I would also like to note that Vietnamese characters are not the only ones missing from the rules list.

---
Thanks and best regards,
Dang Minh Huong
