Re: plpython issue with Win64 (PG 9.2)

From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Asif Naeem <asif(dot)naeem(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: plpython issue with Win64 (PG 9.2)
Date: 2012-06-28 22:36:47
Message-ID: 4FECDC7F.90403@wulczer.org
Lists: pgsql-hackers

On 27/06/12 13:57, Jan Urbański wrote:
> On 27/06/12 11:51, Asif Naeem wrote:
>> Hi,
>>
>> On Windows 7 64bit, plpython is causing server crash with the following
>> test case i.e.
>>
>> CREATE PROCEDURAL LANGUAGE 'plpython3u';
>>> CREATE OR REPLACE FUNCTION pymax (a integer, b integer)
>>> RETURNS integer
>>> AS $$
>>>   if a > b:
>>>     return a
>>>   return b
>>> $$ LANGUAGE plpython3u;
>>> SELECT pymax(1, 2);
>
>>
>> I think the primary reason that triggers this issue is that when function
>> PLyUnicode_Bytes() calls "PyUnicode_AsEncodedString( ,WIN1252 /*Server
>> encoding*/, ) " it fails with null. I built the latest pg 9.2 source code
>> with python 3.2.2.3 by using Visual Studio 2010. Thanks.
>
> I'll try to reproduce this on Linux, which should be possible given the
> results of your investigation.

Your analysis is correct. I managed to reproduce this by injecting

serverenc = "win1252";

into PLyUnicode_Bytes. The comment in that function says that Python 
understands all PostgreSQL encoding names except for SQL_ASCII, but 
that's not really true. In your case GetDatabaseEncodingName() returns
"WIN1252", whereas the name Python accepts is "CP1252".
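
The mismatch can be checked from plain Python, outside the server. The
following is only a sketch: python_knows_codec is a hypothetical helper
name, not something in PL/Python, and it relies solely on the standard
codecs.lookup() call, which raises LookupError for names Python cannot
resolve.

```python
import codecs


def python_knows_codec(name):
    """Return True if Python's codec registry can resolve `name`."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# Compare the PostgreSQL-style name with the Python-style name.
for candidate in ("WIN1252", "CP1252"):
    print(candidate, python_knows_codec(candidate))
```

Running the same check over every encoding name the server can report
would show which ones need a special case.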

I'm wondering how this should be fixed. Just by adding more special 
cases in PLyUnicode_Bytes?

Even if we add a switch statement that would convert PG_WIN1250 into 
"CP1250", Python can still raise an exception when encoding (for various 
reasons). How about replacing the PLy_elog there with just an elog? This 
loses traceback support and the Python exception message, which could be 
helpful for debugging (something like "invalid character <foo> for 
encoding cp1250"). OTOH, I'm uneasy about invoking the entire PLy_elog 
machinery from a function that's as low-level as PLyUnicode_Bytes.
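
To illustrate the point that encoding can fail even with a valid codec
name, here is a small Python sketch. encode_or_report is a hypothetical
helper written for this example only, and the CJK character is just one
arbitrary input that cp1250 cannot represent.

```python
def encode_or_report(text, codec):
    """Encode `text`; return (bytes, None) on success or (None, message)."""
    try:
        return text.encode(codec), None
    except UnicodeEncodeError as exc:
        # The exception text is the detail that a plain elog would drop.
        return None, str(exc)


data, error = encode_or_report("\u4e2d", "cp1250")
print(data, error)
```

The second element of the result carries the Python exception message,
which is the debugging information at stake when choosing between
PLy_elog and elog.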

Lastly, we map SQL_ASCII to "ascii" which is arguably wrong. The 
function is supposed to return bytes in the server encoding, and under 
SQL_ASCII that probably means we can return anything (i.e. use any 
encoding we deem useful). Using "ascii" as the Python codec name will 
raise an error on anything that has the high bit set.
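
The high-bit behaviour is easy to see in isolation. In this sketch,
encodes_as_ascii is a hypothetical helper, and the UTF-8 call at the end
is only an assumed example of "any encoding we deem useful", not a
proposal for what PL/Python should do.

```python
def encodes_as_ascii(text):
    """Return True if every character of `text` fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(encodes_as_ascii("plain"))      # only 7-bit characters
print(encodes_as_ascii("caf\u00e9"))  # contains a non-ASCII character

# A different codec, UTF-8 here, yields bytes with the high bit set.
print("caf\u00e9".encode("utf-8"))
```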

So: I'd add code to translate WINxxx into CPxxx when choosing the Python 
codec to use, change PLy_elog to elog in PLyUnicode_Bytes and leave the 
SQL_ASCII case alone, as there were no complaints and people using 
SQL_ASCII are asking for it anyway.
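
The proposed name translation could look roughly like the sketch below.
It is written in Python for brevity, while the real change would be C
code in PLyUnicode_Bytes. python_codec_for is a hypothetical name, and
the rule "WIN followed by digits becomes cp followed by the same digits"
is an assumption taken from the WIN1252/CP1252 and WIN1250/CP1250 pairs
mentioned above.

```python
import re

# Matches PostgreSQL-style names such as WIN1250 or WIN1252.
_WIN_NAME = re.compile(r"^WIN(\d+)$", re.IGNORECASE)


def python_codec_for(pg_encoding):
    """Map a PostgreSQL encoding name to a Python codec name (sketch)."""
    match = _WIN_NAME.match(pg_encoding)
    if match:
        return "cp" + match.group(1)
    if pg_encoding.upper() == "SQL_ASCII":
        # Left as it is today, per the conclusion above.
        return "ascii"
    # Assumption: every other name is passed through unchanged.
    return pg_encoding.lower()


print(python_codec_for("WIN1252"))
```

Names that do not match the WINxxx pattern are passed through untouched,
so any other mismatch would still need its own special case.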

Cheers,
Jan
