Re: BUG #15548: Unaccent does not remove combining diacritical characters

From: Ramanarayana <raam(dot)soft(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Hugh Ranalli <hugh(at)whtc(dot)ca>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date: 2019-02-12 13:54:20
Message-ID: CAKm4Xs4zKcNYW=-E9C8h_o74xhOrw4miZRK0krya1puEqKAECA@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

Hi Michael,
The issue was that the Python script worked in Python 2 but not in
Python 3 on Windows. The script writes its final output to stdout, and
the stdout encoding is set to UTF-8 only for Python 2, not for Python 3.
If no encoding is set for stdout, Python takes the encoding from the
operating system. The default encoding on Linux and Windows can differ,
hence this issue.
Regards,
Ram.
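
For illustration, here is a minimal sketch of the failure and of one way to
force UTF-8 output in Python 3. It is not the actual patch from this thread;
the function names are hypothetical, and cp1252 is only assumed as a typical
Windows console encoding.

```python
import io


def make_utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is always UTF-8.

    Hypothetical helper: in a real script the argument would typically be
    sys.stdout.buffer, so the OS default encoding no longer matters.
    """
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


def demo():
    """Show that cp1252 rejects U+0100 while UTF-8 encodes it."""
    # Simulate a stdout whose encoding was inherited from the OS (assumed cp1252).
    legacy = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")
    try:
        legacy.write("\u0100")
        legacy.flush()
        legacy_failed = False
    except UnicodeEncodeError:
        # Same class of error as the 'charmap' failure quoted below.
        legacy_failed = True

    # The same character written through an explicit UTF-8 wrapper.
    raw = io.BytesIO()
    out = make_utf8_writer(raw)
    out.write("\u0100")
    out.flush()
    return legacy_failed, raw.getvalue()
```

In a script this would be applied as sys.stdout = make_utf8_writer(sys.stdout.buffer);
on Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") should have
the same effect.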

On Tue, 12 Feb 2019 at 09:48, Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Tue, Feb 12, 2019 at 02:27:31AM +0530, Ramanarayana wrote:
> > I tested the script in Python 2.7 and it works perfectly. The problem is in
> > Python 3.7 (and maybe only on Windows, as you were not getting the issue),
> > where I was getting the following error:
> >
> > UnicodeEncodeError: 'charmap' codec can't encode character '\u0100' in
> > position 0: character maps to <undefined>
> >
> > I went through the Python script and found that the stdout encoding is
> > set to utf-8 only if the Python version is <= 2.
> >
> > I have made the same change for Python version 3 as well. Please find the
> > patch for the same. Let me know if it makes sense.
>
> Isn't that because Windows encoding becomes cp1252, utf16 or such?
> FWIW, on Debian SID with Python 3.7, I get the correct output, and no
> diffs on HEAD. Perhaps it would make sense to use open() on the
> different files with encoding='utf-8' to avoid any kind of problems?
> --
> Michael
>
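
As a sketch of the open() approach suggested above: passing encoding='utf-8'
explicitly makes reads and writes independent of the platform default. The
helper names and the tab-separated file layout below are assumptions made for
illustration only, not what the unaccent script actually does.

```python
import os
import tempfile


def write_pairs(path, pairs):
    """Write (source, replacement) pairs as tab-separated UTF-8 lines."""
    # An explicit encoding avoids falling back to cp1252 or similar on Windows.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for src, dst in pairs:
            f.write("%s\t%s\n" % (src, dst))


def read_pairs(path):
    """Read the pairs back, again with an explicit UTF-8 encoding."""
    with open(path, encoding="utf-8") as f:
        return [tuple(line.rstrip("\n").split("\t")) for line in f]


def roundtrip_demo():
    """Write one pair containing U+0100 and return (raw bytes, parsed pairs)."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.rules")  # hypothetical file name
    try:
        write_pairs(path, [("\u0100", "A")])
        with open(path, "rb") as f:
            raw = f.read()
        return raw, read_pairs(path)
    finally:
        if os.path.exists(path):
            os.remove(path)
        os.rmdir(directory)
```

The trade-off is that every open() call has to carry the argument, whereas
rewrapping stdout is a single change at the top of the script.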

--
Cheers
Ram 4.0
