Re: Perl DBI converts UTF-8 again to UTF-8 before sending it to the server

From: Matthias Apitz <guru(at)unixarea(dot)de>
To: Christoph Moench-Tegeder <cmt(at)burggraben(dot)net>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Perl DBI converts UTF-8 again to UTF-8 before sending it to the server
Date: 2019-10-11 14:08:08
Message-ID: 20191011140808.GA6717@c720-r342378
Lists: pgsql-general


Christoph,

May I come back to the UTF-8 problem, but now for the reading aspect:

I connect to the PG server with:

    $dbh = DBI->connect($PGDB, $PGDB_USER, $PGDB_PASS,
        { pg_utf8_flag   => 1,
          pg_enable_utf8 => 1,
          AutoCommit     => 0,
          RaiseError     => 0,
          PrintError     => 0,
        }
    );

and do a SELECT for a column which contains UTF-8 data (I double checked
this with SQL and ::bytea):

    $sth = $dbh->prepare(
        "select d02name from d02ben where d02bnr = '00001048313'")
        or die "parse error\n" . $DBI::errstr . "\n";

    $sth->execute
        or die "exec error\n" . $DBI::errstr . "\n";

but when I now fetch the first row with:

    @row = $sth->fetchrow_array;
    $HexStr = unpack("H*", $row[0]);
    print "HexStr: " . $HexStr . "\n";
    print "$row[0]\n";

the resulting column contains ISO-8859-1 data (0xe4 for the "ä" instead of the UTF-8 bytes 0xc3 0xa4):

HexStr: 50e46461676f67697363686520486f6368736368756c65205765696e67617274656e2020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020
P<E4>dagogische Hochschule Weingarten

According to the DBD::Pg man page, the attribute pg_enable_utf8 => 1
should ensure that strings are returned from DBI with the UTF-8 flag
switched on. The server does send the string in UTF-8, as I can see with
strace:

...
recvfrom(3, "T\0\0\0 \0\1d02name\0\0\1\313\237\0\3\0\0\4\22\377\377\0\0\0|\0\0D\0\0\0\203\0\1\0\0\0yP\303\244dagogische Hochschule Weingarten C\0\0\0\rSELECT 1\0Z\0\0\0\5T", 16384, 0, NULL, NULL) = 185
write(1, "HexStr: 50e46461676f67697363686520486f6368736368756c65205765696e67617274656e2020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020\n", 249) = 249
write(1, "P\344dagogische Hochschule Weingarten
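
For comparison, here is a minimal sketch that does not touch the database at all. It assumes that a string returned with the UTF-8 flag on behaves like the result of Encode::decode applied to the bytes seen on the wire; the variable names are made up for illustration and this is not DBD::Pg's actual code path. It shows that unpack("H*") on such a character string reports code points (e4 for "ä"), while the encoded octets are still c3 a4:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The first bytes of the value as they appear in the strace recvfrom() line.
my $wire_bytes = "P\303\244dagogische";

# Assumption: a driver that sets the UTF-8 flag hands back a decoded
# character string, which decode() imitates here.
my $chars = decode('UTF-8', $wire_bytes);

# unpack("H*") works per character, so U+00E4 shows up as "e4".
my $hex_chars = unpack("H*", $chars);

# Encoding back to octets first shows the UTF-8 bytes "c3a4" again.
my $hex_bytes = unpack("H*", encode('UTF-8', $chars));

print "chars: $hex_chars\n";
print "bytes: $hex_bytes\n";
```

If this assumption holds, the hex dump above would be describing Perl characters rather than the bytes received, and a plain print to a filehandle without an encoding layer would write the character U+00E4 as the single byte \344.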

But why does it get translated to ISO-8859-1?

Thanks for your help again.

matthias
--
Matthias Apitz, ✉ guru(at)unixarea(dot)de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
