DBD::Pg BYTEA Character Escaping

From: David Wheeler <david(at)wheeler(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: DBD::Pg BYTEA Character Escaping
Date: 2001-11-18 04:58:46
Message-ID: 1006059527.1307.14.camel@mercury.atomicode.com
Lists: pgsql-general

Hi All,

I recently noticed that the DBD::Pg Perl module appears to be doing a
lot of work escaping characters for BYTEA data types. It imports
Perl's POSIX support to check every character in BYTEA data with
isprint(), and replaces the character with its octal representation if
it's not printable.

However, there are two issues with this approach. The first is
efficiency. The code as currently written in DBD::Pg does a *lot* of
unnecessary work, and I'd like to suggest an optimization (based on
discussions of this topic on the Fun with Perl mailing list:
http://archive.develooper.com/fwp%40perl.org/msg00458.html -- patch
supplied upon request).

The second issue, however, is that it doesn't appear to me that it's
even necessary that non-printable characters be replaced. Although Alex
Pilosov says that such an approach is needed:

http://www.geocrawler.com/mail/msg.php3?msg_id=6509224&list=10

Joe Conway found that there were only three characters ('\', "'", and
"\0") that needed to be escaped, and it was those three characters that
Bruce Momjian documented for the forthcoming 7.2 release:

http://www.geocrawler.com/mail/msg.php3?msg_id=6547225&list=10

If that's true, then escaping every non-printable character is
overkill, and only those three characters need to be escaped.
And since it looks like two of them ('\' and "'") are already escaped
before the non-printable characters are escaped in DBD::Pg, it
seems that this code:

    if ($data_type == DBI::SQL_BINARY ||
        $data_type == DBI::SQL_VARBINARY ||
        $data_type == DBI::SQL_LONGVARBINARY) {
        $str = join("", map { isprint($_) ? $_ : '\\' . sprintf("%03o", ord($_)) }
                        split //, $str);
    }

could be changed to:

    $str =~ s/\0/\\000/g if $data_type == DBI::SQL_BINARY ||
                            $data_type == DBI::SQL_VARBINARY ||
                            $data_type == DBI::SQL_LONGVARBINARY;
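
To make the difference concrete, here is a minimal standalone sketch of
the two behaviours. The sub names are made up for illustration and are
not part of DBD::Pg, and the [[:print:]] character class is an assumed
stand-in for POSIX::isprint(), which newer Perls no longer provide. It
also assumes, as described above, that '\' and "'" have already been
escaped by the time this step runs.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not DBD::Pg code.

# Approximates the current behaviour: octal-escape every byte that is
# not printable. [[:print:]] stands in for POSIX::isprint().
sub escape_nonprintable {
    my ($str) = @_;
    $str =~ s/([^[:print:]])/sprintf('\\%03o', ord $1)/ge;
    return $str;
}

# The proposed behaviour: octal-escape only the NUL byte and leave
# every other byte untouched.
sub escape_nul_only {
    my ($str) = @_;
    $str =~ s/\0/\\000/g;
    return $str;
}

my $data = "a\0b\x01c";

# Every non-printable byte becomes a backslash plus three octal digits.
print escape_nonprintable($data), "\n";    # prints a\000b\001c

# Only the NUL is rewritten, so the result is shorter: the \x01 byte
# is passed through as-is.
printf "%d vs %d bytes\n",
    length escape_nonprintable($data),
    length escape_nul_only($data);
```

The second form is a single pass by the regex engine rather than a
split, a map with a function call per character, and a join, which is
where the efficiency gain would come from if only NUL really needs
escaping.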

So, I'm posting this query because I'd like to get confirmation, if
possible, of this conclusion. Based on the feedback I receive, I will
submit patches to the DBD::Pg maintainer.

Thanks!

David

PS: If discussion of this issue needs to be moved to the Hackers list,
I'll be happy to do so. I just thought I'd try here, first.

--
David Wheeler AIM: dwTheory
David(at)Wheeler(dot)net ICQ: 15726394
Yahoo!: dew7e
Jabber: Theory(at)jabber(dot)org
