Re: Unaccent extension python script Issue in Windows

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: hugh(at)whtc(dot)ca
Cc: raam(dot)soft(at)gmail(dot)com, michael(at)paquier(dot)xyz, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com
Subject: Re: Unaccent extension python script Issue in Windows
Date: 2019-03-18 05:13:34
Message-ID: 20190318.141334.186469242.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello.

At Sun, 17 Mar 2019 20:23:05 -0400, Hugh Ranalli <hugh(at)whtc(dot)ca> wrote in <CAAhbUMNoBLu7jAbyK5MK0LXEyt03PzNQt_Apkg0z9bsAjcLV4g(at)mail(dot)gmail(dot)com>
> Hi Ram,
> Thanks for doing this; I've been overestimating my ability to get to things
> over the last couple of weeks.
>
> I've looked at the patch and have made one minor change. I had moved all
> the imports up to the top, to keep them in one place (and I think some had
> originally been used only by the Python 2 code. You added them there, but
> didn't remove them from their original positions. So I've incorporated that
> into your patch, attached as v2. I've tested this under Python 2 and 3 on
> Linux, not Windows.

Though I'm not sure it is necessary to run the script on
Windows, the problem is not specific to Windows. It is a general
one that simply had not been hit on non-Windows environments.

On CentOS7:
> export LANG="ja_JP.EUCJP"
> python <..snipped..>
..
> UnicodeEncodeError: 'euc_jp' codec can't encode character '\xab' in position 0: illegal multibyte sequence

So this is not an issue with Windows but with python3.
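
To illustrate the failure above outside the script, here is a minimal
sketch. It is not taken from the patch; the helper name
encode_via_stream is made up for this example. It writes text through
a text stream with an explicit encoding, which is roughly what Python 3
does for stdout using the locale's encoding.

```python
import io


def encode_via_stream(text, encoding):
    """Write text through a text stream using the given encoding.

    Hypothetical helper for illustration only: it mimics Python 3
    encoding str output with the stream's (locale-derived) encoding.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)   # unencodable characters raise UnicodeEncodeError
    stream.flush()
    return raw.getvalue()
```

With encoding 'utf-8' the character '\xab' is written as two bytes,
while with 'euc_jp' the same call raises UnicodeEncodeError, as in the
CentOS7 session above.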

With the patch, the script generates identical files with both
versions of Python on Linux and Windows 7. Python 3 on Windows
emits CRLF as the newline, but that doesn't seem to do any harm. (I
haven't confirmed that, because the build is currently extremely
slow for reasons I haven't identified.)

This patch contains irrelevant changes. The minimal required
change would be the attached. If you want to refactor the
UnicodeData reader or rearrange the imports, those should be
separate patches.

It would be better to use IOBase for Python 3, especially for the
stdout replacement, but I didn't since it *is* working.
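
For reference, an io-based stdout replacement might look like the
sketch below. This is an assumption about one possible approach, not
what the attached patch does; the helper name utf8_text_stream is
hypothetical. Passing newline='\n' is an extra choice made here so that
the output uses LF on every platform.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream so that str output is encoded as UTF-8.

    Hypothetical helper for illustration only. newline='\n' disables
    newline translation, so Windows would not emit CRLF.
    """
    return io.TextIOWrapper(binary_stream, encoding='utf-8', newline='\n')
```

On Python 3 it could be applied as
sys.stdout = utf8_text_stream(sys.stdout.buffer), which makes the
output independent of LANG.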

> Everything else looks correct. I apologise for not having replied to your
> question in the original bug report. I had intended to, but as I said,
> there's been an increase in the things I need to juggle at the moment.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment: v3_unaccent_python3_compatibility.patch (text/x-patch, 965 bytes)
