Re: BUG #17142: COPY ignores client_encoding for octal digit characters

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: vilarion(at)illarion(dot)org, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17142: COPY ignores client_encoding for octal digit characters
Date: 2021-08-12 07:40:35
Message-ID: d06c9ac0-1e22-7247-8b98-3c13d550d43d@iki.fi
Lists: pgsql-bugs

On 12/08/2021 00:24, PG Bug reporting form wrote:
> Characters in octal digits should be possible as per
> https://www.postgresql.org/docs/13/sql-copy.html
> When using characters directly (char buffer[] = "\304\366\337") the expected
> output is displayed.
>
> My apologies if I misunderstood something.

The code is pretty clear that the \123 and \x12 escapes are evaluated
after encoding conversion. That means the escapes are interpreted in
the database encoding, regardless of the client encoding. The documentation
doesn't say anything about that, though, so we should fix the docs. How
does the attached patch look?
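
To illustrate the order of operations, here is a rough Python model. It is
only a sketch, not the actual COPY code: the names copy_unescape and
simulate_copy_in are made up for this example, only octal escapes are
handled, and Python's codecs stand in for the server's encoding conversion.

```python
import re

def copy_unescape(raw: bytes) -> bytes:
    """Replace backslash-octal escapes (1 to 3 digits) with the byte they denote."""
    return re.sub(rb"\\([0-7]{1,3})",
                  lambda m: bytes([int(m.group(1), 8) & 0xFF]),
                  raw)

def simulate_copy_in(client_bytes: bytes, client_encoding: str,
                     database_encoding: str) -> bytes:
    # Step 1: convert the incoming data from client to database encoding.
    converted = client_bytes.decode(client_encoding).encode(database_encoding)
    # Step 2: only then evaluate the escapes. The resulting bytes are
    # taken as-is, i.e. as bytes in the database encoding.
    return copy_unescape(converted)

# Literal LATIN1 characters are converted to UTF-8 as expected.
literal = simulate_copy_in(b"\xc4\xf6\xdf", "latin-1", "utf-8")

# The escapes are plain ASCII, so conversion leaves them alone; afterwards
# they turn into the raw bytes C4 F6 DF, which are not valid UTF-8.
escaped = simulate_copy_in(rb"\304\366\337", "latin-1", "utf-8")
```

In this model, the literal input ends up as the UTF-8 bytes for the three
LATIN1 characters, while the escaped input ends up as the bytes C4 F6 DF
unchanged, which matches the behaviour described in the report.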

You could get weird results if you use the escapes for only some of the
bytes of a multi-byte character. Mostly you'd get invalid byte sequence
errors, but I think with the right combination of client and database
encodings, the results could get stranger. I think the wording in the
attached docs patch is enough to cover that, though.
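
A small Python sketch of the common "invalid byte sequence" case, assuming a
LATIN1 client and a UTF8 database. The helper names (unescape_octal, load,
is_valid) are hypothetical, and Python's codecs only approximate what the
server does:

```python
import re

def unescape_octal(raw: bytes) -> bytes:
    # Turn each backslash-octal escape into the single byte it denotes.
    return re.sub(rb"\\([0-7]{1,3})",
                  lambda m: bytes([int(m.group(1), 8) & 0xFF]),
                  raw)

def load(client_bytes: bytes, client_encoding: str,
         database_encoding: str = "utf-8") -> bytes:
    # Conversion first, escape evaluation second.
    converted = client_bytes.decode(client_encoding).encode(database_encoding)
    return unescape_octal(converted)

def is_valid(data: bytes, encoding: str = "utf-8") -> bool:
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# Both bytes of the UTF-8 character C3 B6 written as escapes: this works,
# because the escapes spell out the whole character in the database encoding.
whole = load(rb"\303\266", "latin-1")

# Only the first byte escaped, the second sent as a raw LATIN1 byte 0xB6:
# conversion expands 0xB6 to C2 B6, so the result is C3 C2 B6.
mixed = load(b"\\303" + b"\xb6", "latin-1")

print(is_valid(whole), is_valid(mixed))
```

The mixed input fails validation in this model because the raw byte is
converted on its own before the escaped byte is put in front of it.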

- Heikki

Attachment: 0001-Doc-123-and-x12-escapes-in-COPY-are-in-database-enco.patch
Content-Type: text/x-patch
Size: 1.9 KB
