Re: BUG #13440: unaccent does not remove all diacritics

From: Léonard Benedetti <benedetti(at)mlpo(dot)fr>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13440: unaccent does not remove all diacritics
Date: 2016-02-17 21:07:41
Message-ID: 56C4E11D.5050809@mlpo.fr
Lists: pgsql-bugs

On 12/02/2016 17:44, Teodor Sigaev wrote:
> I'm inclined to commit this patch because it suggests a more regular
> way to update the unaccent rules. That is nice.
>
> But I have some remarks:
> 1 Is it possible not to restrict the generator script to Python 2?
> Python 2, it seems, will go away in the near future, and it will not
> be convenient to install it for a single task.

Yes, I agree, it makes sense: the script was originally written for
Python 2, but Python 2 is legacy. Moreover, adapting the script to
Python 3 seems trivial.

> 2 As is easy to see, nowhere in the pgsql sources is UTF-8 encoding
> used, just ASCII. I don't see a reason to make an exception for this
> script.

First of all, the majority of the pgsql code is C, a language in which
the default source encoding is not the same everywhere (it may depend on
the locale settings or the compiler), so it is logical to use ASCII there.

On the other hand, UTF-8 encoding for source code is *a feature of
Python 3* (to quote the documentation: “The default encoding for Python
source code is UTF-8”), so there is no possible ambiguity, and it will
not be a problem. That said, some non-ASCII characters could be removed
from the source code of the script without loss (I am thinking in
particular of "“" and "”"). For some comments, however, that would be
unfortunate (e.g. “# RegEx to parse rules (e.g. “Đ → D ; […]”)” or “# ℃
°C”).
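
To illustrate the point, here is a minimal sketch (not the code from the
patch) of a Python 3 file that uses non-ASCII characters both in comments
and in a regular expression, without any encoding declaration. The rule
format "source → target ;", the pattern RULE and the function parse_rule
are assumptions made up for this example.

```python
# Python 3 reads source files as UTF-8 by default, so no
# "# -*- coding: utf-8 -*-" line is needed for the characters below.
import re

# Hypothetical rule format, for illustration only: "Đ → D ;"
RULE = re.compile(r"^\s*(?P<src>\S+)\s*→\s*(?P<dst>[^;]*?)\s*;")


def parse_rule(line):
    """Return (source, target) for a rule line, or None if it does not match."""
    match = RULE.match(line)
    if match is None:
        return None
    return match.group("src"), match.group("dst")


if __name__ == "__main__":
    print(parse_rule("Đ → D ;"))    # ('Đ', 'D')
    print(parse_rule("℃ → °C ;"))   # ('℃', '°C')
```

Under Python 2 the same file would need an explicit encoding declaration
and unicode literals, which is one more reason to target Python 3 only.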

>
> Thank you.
>

Thus, I propose to adapt the code to Python 3 (the encoding of the
script does not seem to be a problem, for the reasons above). I will try
to do it shortly.

Thank you for your feedback.

Léonard Benedetti
