
Re: pl/perl and utf-8 in sql_ascii databases

From: Alex Hunsaker <badalex(at)gmail(dot)com>
To: Christoph Berg <cb(at)df7cb(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pl/perl and utf-8 in sql_ascii databases
Date: 2012-02-10 19:53:05
Message-ID: CAFaPBrR9y1fu6gpVu+8TA8vTY6QVCm3DfarKT8JG_EhGeTXnDA@mail.gmail.com
Lists: pgsql-hackers
On Thu, Feb 9, 2012 at 03:21, Christoph Berg <cb(at)df7cb(dot)de> wrote:
> Hi,
>
> we have a database that is storing strings in various encodings (and
> non-encodings, namely the arbitrary byte soup [ ... ]
> For this reason, the database uses
> sql_ascii encoding

> ...snip...

> In sql_ascii databases, utf_e2u does not do any recoding, but then
> SvUTF8_on still marks the string as utf-8, while it isn't.
>
> (Returned values might also need fixing.)
>
> In my view, this is clearly a bug in pl/perl on sql_ascii databases.

Yeah, there was some musing about this over in:
http://archives.postgresql.org/pgsql-hackers/2011-02/msg01142.php

Seems like we missed the fact that we still did SvUTF8_on() in sv2cstr
and SvPVUTF8() when turning a perl string into a cstring.
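
To make the mismatch concrete, here is a rough sketch in Python (not the actual pl/perl C code; the helper name is hypothetical). It only checks whether a byte string is well-formed UTF-8, on the assumption that in a sql_ascii database the bytes reach Perl unchanged, so a lone 0x80 byte would get flagged as UTF-8 even though it is not:

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as strict UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# E'\200' is the single byte 0x80: a continuation byte with no lead byte.
lone_byte = b"\x80"

# The code point U+0080 properly encoded as UTF-8 takes two bytes.
encoded = "\u0080".encode("utf-8")

print(is_valid_utf8(lone_byte))  # False
print(is_valid_utf8(encoded))    # True
print(encoded)                   # b'\xc2\x80'
```

Setting the UTF-8 flag on the first value without recoding it is the situation Christoph describes.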

With the attached I get:
=> create or replace function perl_white(a text) returns text as $$
return shift; $$ language plperlu;
=> select perl_white(E'\200'), perl_white(E'\200')::bytea,
coalesce(perl_white(E'\200'), 'null');
 perl_white | perl_white | coalesce
------------+------------+----------
            | \x80       |

=> select perl_white(E'\401');
 perl_white
------------
 \x01
(1 row)

Does the attached fix the issue for you?

I'll note that all the PLs seem to behave a bit differently:

=> create or replace function py_white(a text) returns text as $$
return a; $$ language plpython3u;
=> select py_white(E'\200'), py_white(E'\200')::bytea,
coalesce(py_white(E'\200'), 'null');
 py_white | py_white | coalesce
----------+----------+----------
          |          | null
(1 row)

=> select py_white(E'\401');
 py_white
----------
 \x01
(1 row)

=> create or replace function tcl_white(text) returns text as $$
return $1; $$ language pltcl;
=> select tcl_white(E'\200'), tcl_white(E'\200')::bytea,
coalesce(tcl_white(E'\200'), 'null');
 tcl_white | tcl_white | coalesce
-----------+-----------+----------
           | \x80      |

=> select tcl_white(E'\402');
 tcl_white
-----------
 \x02
(1 row)
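
A note on the \401 and \402 queries: as far as I can tell, the \x01 and \x02 results come from the octal escape itself rather than from any PL. An octal value above 0377 does not fit in one byte, and the outputs are consistent with only the low 8 bits being kept. A small Python sketch of that arithmetic (the function name is hypothetical, and the truncation rule is my reading of the results, not something taken from the patch):

```python
def octal_escape_to_byte(digits: str) -> int:
    """Interpret up to three octal digits and keep only the low 8 bits."""
    value = int(digits, 8)   # e.g. "401" -> 257
    return value & 0xFF      # truncate to a single byte


for digits in ("200", "401", "402"):
    print(f"\\{digits} -> \\x{octal_escape_to_byte(digits):02x}")
# \200 -> \x80
# \401 -> \x01
# \402 -> \x02
```

So only E'\200' actually exercises the high-bit case; the other two are plain ASCII control bytes by the time the function sees them.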

Attachment: plperl_sql_ascii.patch
Description: text/x-patch (2.4 KB)
