Re: Extra Vietnamese unaccent rules

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dang Minh Huong <kakalot49(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Kha Nguyen <nlhkha(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extra Vietnamese unaccent rules
Date: 2017-08-16 21:01:37
Message-ID: 27527.1502917297@sss.pgh.pa.us
Lists: pgsql-hackers

Dang Minh Huong <kakalot49(at)gmail(dot)com> writes:
> On 2017/07/05 15:28, Michael Paquier wrote:
>> (Surprised to see that generate_unaccent_rules.py is inconsistent on
>> MacOS, runs fine on Linux).

FWIW, I got identical results from running the script on current macOS
(Sierra) and Linux (RHEL6).

>> Testing with characters having two accents, the results are produced
>> as wanted. I am attaching an updated patch with all those
>> simplifications. Thoughts?

> Thanks, so pretty. The patch is fine to me.

Pushed into v11. I'm not really qualified to review the Python coding
style, but I did fix a typo in a comment.

BTW, while this isn't a reason to delay this patch, I wonder whether
the regression test for unaccent is really adequate. According to
https://coverage.postgresql.org/contrib/unaccent/unaccent.c.gcov.html
it isn't doing anything to check multicharacter source strings, and
what's considerably more disturbing, it isn't exercising the PG_CATCH
code that's meant to deal with characters outside the current database's
encoding.
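
For readers following the "two accents" test mentioned in the quoted text above, here is a minimal sketch of the idea in Python. It is not the code from generate_unaccent_rules.py; it only assumes that stripping Unicode combining marks after canonical decomposition approximates what the generated rules do for such letters. The function name strip_accents and the sample strings are illustrative only.

```python
import unicodedata


def strip_accents(text):
    """Remove combining marks after full canonical decomposition (NFD).

    Illustrative helper only; not taken from generate_unaccent_rules.py.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a nonzero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Vietnamese letters carrying two diacritics each, plus a short phrase.
samples = ["\u1ec7", "\u1ed1", "\u1eef", "Vi\u1ec7t Nam"]
for s in samples:
    print(s, "->", strip_accents(s))
```

Letters with no canonical decomposition (for example U+0111, d with stroke) pass through this sketch unchanged, so it cannot stand in for the full rules file.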

regards, tom lane
