Re: Unexpected behaviour of encode()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jasen Betts <jasen(at)xnet(dot)co(dot)nz>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Unexpected behaviour of encode()
Date: 2013-03-29 02:26:10
Message-ID: 10292.1364523970@sss.pgh.pa.us
Lists: pgsql-general

Jasen Betts <jasen(at)xnet(dot)co(dot)nz> writes:
> On 2013-03-26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The manual says that 'escape' encoding "merely outputs null bytes as
>> \000 and doubles backslashes".

>> (Having said that, I wonder though if "escape" doesn't need more
>> thought. The output is only valid text in SQL_ASCII or single-byte
>> encodings, otherwise there's risk of encoding violations.)

> it does that too, since as long as I can remember.
> I used decode-hex here so it'll work on older version of pg.

Hah ... that's what I get for believing the manual ;-). The code
comments tell the truth:

* We must escape zero bytes and high-bit-set bytes to avoid generating
* text that might be invalid in the current encoding, or that might
* change to something else if passed through an encoding conversion
* (leading to failing to de-escape to the original bytea value).
* Also of course backslash itself has to be escaped.
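
As a rough illustration of the rule in that comment, here is a minimal
Python sketch (not the actual C code). The function name escape_encode is
made up for this example, and it assumes high-bit-set bytes are written in
the same three-digit octal form as the \000 the manual mentions for null
bytes:

```python
def escape_encode(data: bytes) -> str:
    """Sketch of the 'escape' rule described in the code comment above.

    Hypothetical helper, not part of PostgreSQL or any library.
    """
    out = []
    for b in data:
        if b == 0 or b >= 0x80:
            # Zero bytes and high-bit-set bytes become a backslash
            # followed by three octal digits (assumed format).
            out.append("\\%03o" % b)
        elif b == 0x5C:
            # Backslash itself is doubled.
            out.append("\\\\")
        else:
            # Every other byte is 7-bit and passes through unchanged.
            out.append(chr(b))
    return "".join(out)


if __name__ == "__main__":
    # UTF-8 bytes for "café": the two high-bit bytes are escaped.
    print(escape_encode(b"caf\xc3\xa9"))
```

Because the result contains only 7-bit characters, it should stay valid
text and survive an encoding conversion unchanged, which is the point the
comment makes.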

It appears that the manual's statement was correct before 8.3, but
when somebody fixed the code to deal with the encoding issue, they
didn't fix the manual. I'll go improve that ...

regards, tom lane
