Re: BUG #16236: Invalid escape encoding

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: stephane(dot)campinas(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16236: Invalid escape encoding
Date: 2020-01-27 23:05:45
Message-ID: 25295.1580166345@sss.pgh.pa.us
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> From the documentation [0] about the encode function, the "escape" format
> should "convert zero bytes and high-bit-set bytes to octal sequences (\nnn)
> and doubles backslashes."
> However, executing "select encode(E'aaa\bccc', 'escape');" outputs
> "aaa\x08ccc", although according to the documentation I should get
> "aaa\010ccc".

No, I don't think so. The \b gives rise to a byte with hex value 08
(that is, control-H or backspace) in the E'' literal, which converts
to the same byte value in the bytea value that gets passed to
encode(). Since that's neither a zero nor a high-bit-set value,
encode() just repeats it literally in the text result, and you end
up with the same thing as if you'd just done

=# select E'aaa\bccc'::text;
    text
------------
 aaa\x08ccc
(1 row)
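
To make the documented rule concrete, here is a minimal Python sketch of what the "escape" format is described as doing. It is a model of the documentation's wording, not the backend's actual C code, and the function name encode_escape is hypothetical.

```python
def encode_escape(data: bytes) -> str:
    """Model of the documented 'escape' rule (illustrative, not the backend code)."""
    out = []
    for b in data:
        if b == 0 or b >= 0x80:
            # Zero bytes and high-bit-set bytes become three-digit octal: \nnn
            out.append("\\%03o" % b)
        elif b == 0x5C:
            # A backslash is doubled
            out.append("\\\\")
        else:
            # Everything else, including control characters such as
            # backspace (0x08), is copied through unchanged
            out.append(chr(b))
    return "".join(out)


# The byte produced by \b in the E'' literal is 0x08: neither zero nor
# high-bit-set, so it passes through as a raw backspace character.
result = encode_escape(b"aaa\x08ccc")
```

Under this model the result still contains the raw 0x08 byte, so any "\x08" you see has to come from whatever displays the string afterwards.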

I think it must be psql itself that's choosing to represent the
backspace as \x08, because nothing in the backend does that.
(pokes around ... yeah, it's pg_wcsformat() that's doing it)
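
As a rough illustration of that display step, the following Python sketch renders control characters the way the output above suggests. It is a simplified assumption about the behavior, not a transcription of pg_wcsformat(); the function name and the choice to leave newline, carriage return, and tab alone are hypothetical.

```python
def display_control_chars(text: str) -> str:
    """Simplified, assumed model of a client rendering control characters."""
    out = []
    for ch in text:
        code = ord(ch)
        is_control = code < 0x20 or code == 0x7F
        if is_control and ch not in "\n\r\t":
            # Show the character as a visible \xNN sequence
            out.append("\\x%02X" % code)
        else:
            out.append(ch)
    return "".join(out)


# The server returns a raw backspace; only the client-side display adds "\x08".
shown = display_control_chars("aaa\x08ccc")
```

In this model the stored value is unchanged and only its on-screen representation gains the four visible characters "\x08".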

You could certainly make an argument that encode() ought to
backslashify all ASCII control characters, not only \0. But
it's behaving as documented, AFAICS.

regards, tom lane
