Re: [rfc] unicode escapes for extended strings

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sam Mason <sam(at)samason(dot)me(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [rfc] unicode escapes for extended strings
Date: 2009-04-17 21:33:59
Message-ID: 28978.1240004039@sss.pgh.pa.us
Lists: pgsql-hackers

Sam Mason <sam(at)samason(dot)me(dot)uk> writes:
> On Fri, Apr 17, 2009 at 07:01:47PM +0200, Martijn van Oosterhout wrote:
>> On Fri, Apr 17, 2009 at 07:07:31PM +0300, Marko Kreen wrote:
>>> Btw, is there any good reason why we don't reject \000, \x00
>>> in text strings?
>>
>> Why forbid nulls in text strings?

> As far as I know, PG assumes, like most C code, that strings don't
> contain embedded NUL characters.

Yeah; we should reject them because nothing will behave very sensibly
with them, eg

regression=# select E'abc\000xyz';
 ?column? 
----------
 abc
(1 row)
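
The truncation shown here is what one would expect if the value is treated as a NUL-terminated C string at some point on its way to the client. The following is a minimal sketch of that effect in Python, not PostgreSQL's own code: `as_c_string` and `reject_embedded_nul` are hypothetical helper names, and the second one only illustrates the kind of up-front check being suggested.

```python
import ctypes


def as_c_string(data: bytes) -> bytes:
    """Return the bytes that C-style string handling would see."""
    # create_string_buffer copies the bytes and appends a terminating NUL.
    buf = ctypes.create_string_buffer(data)
    # .value reads up to the first NUL byte, as strlen()/strcpy() would.
    return buf.value


def reject_embedded_nul(data: bytes) -> bytes:
    """Hypothetical check: refuse input that contains a zero byte."""
    if b"\x00" in data:
        raise ValueError("embedded NUL byte is not allowed in a text string")
    return data


payload = b"abc\x00xyz"
print(len(payload))          # 7 bytes were supplied
print(as_c_string(payload))  # b'abc'
```

Everything after the zero byte is silently lost once the data is read as a C string, which is why rejecting the input early is arguably the more sensible behaviour.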

The point has come up before, and I kinda thought we *had* changed the
lexer to reject \000. I see we haven't though. Curiously, this
does fail:

regression=# select U&'abc\0000xyz';
ERROR: invalid byte sequence for encoding "SQL_ASCII": 0x00
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

though that's not quite the message I'd have expected to see.

regards, tom lane
