Re: invalidly encoded strings

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: invalidly encoded strings
Date: 2007-09-09 18:00:35
Message-ID: 1189360835.5924.14.camel@jdavis
Lists: pgsql-hackers pgsql-patches

On Sun, 2007-09-09 at 10:51 -0400, Tom Lane wrote:
> A possible answer is to add a verifymbstr to the string literal
> converter anytime it has processed a numeric backslash-escape in the
> string. Open questions for that are (1) does it have negative effects
> for bytea, and if so is there any hope of working around it? (2) how
> can we postpone the conversion/test to the parse analysis phase?
> (To the extent that database encoding is frozen it'd probably be OK
> to do it in the scanner, but such a choice will come back to bite
> us eventually.)

Regarding #1:

Currently, you can pass a bytea literal as either E'\377\377\377' or
E'\\377\\377\\377'.

The first strategy (single backslash) is not correct: with
E'\377\000\377', the embedded null byte is treated as the end of the
cstring, even though there are bytes after it. Similar strange things
happen if you have an E'\134' (backslash) somewhere in the string.
However, I have no doubt that there are people who use the first
strategy anyway, and the proposed change would break that for them.
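
To make the difference between the two spellings concrete, here is a rough
Python sketch of the two unescaping passes involved. It is only a simplified
model, not the actual PostgreSQL code: the helper names (unescape_once,
as_cstring, bytea_from_literal) are made up for illustration, input is assumed
to be ASCII, and the second pass is more lenient than a real bytea input
routine would be.

```python
import re

def unescape_once(text: str) -> bytes:
    """Resolve one level of backslash escapes: \\ooo (octal) or \\<char>."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            octal = re.match(r"[0-7]{1,3}", text[i + 1:i + 4])
            if octal:
                # Numeric escape: emit the byte with that octal value.
                out.append(int(octal.group(), 8) & 0xFF)
                i += 1 + len(octal.group())
            else:
                # Any other escaped character stands for itself.
                out.append(ord(text[i + 1]))
                i += 2
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)

def as_cstring(data: bytes) -> bytes:
    """Model a NUL-terminated C string: everything after the first NUL is lost."""
    return data.split(b"\0", 1)[0]

def bytea_from_literal(body: str) -> bytes:
    """Hypothetical pipeline: string-literal pass, cstring hand-off, bytea pass."""
    after_parser = as_cstring(unescape_once(body))
    return unescape_once(after_parser.decode("latin-1"))

# Body of E'\377\000\377': the literal pass already produces a NUL byte.
single = bytea_from_literal(r"\377\000\377")
# Body of E'\\377\\000\\377': the literal pass yields the text \377\000\377.
double = bytea_from_literal(r"\\377\\000\\377")

print(single)  # only the first byte survives
print(double)  # all three bytes survive
```

In this model the single-backslash form loses everything after the embedded
null, while the double-backslash form delivers all three bytes. Running
unescape_once(r"\377\134\377") likewise shows the first pass leaving a bare
backslash in the middle of the data, which the second pass would then try to
interpret again.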

There may be more issues, too.

Regards,
Jeff Davis
