Re: confusing / inefficient "need_transcoding" handling in copy

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org, Sutou Kouhei <kou(at)clear-code(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: confusing / inefficient "need_transcoding" handling in copy
Date: 2024-02-06 22:24:45
Message-ID: 20240206222445.hzq22pb2nye7rm67@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-02-06 12:51:48 -0500, Tom Lane wrote:
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> > On Mon, Feb 05, 2024 at 06:05:04PM -0800, Andres Freund wrote:
> >> I haven't yet dug into the code history. One guess is that this should only
> >> have been set this way for COPY FROM.
>
> > Looking the git history, this looks like an oversight of c61a2f58418e
> > that has added the condition on pg_database_encoding_max_length(), no?
> > Adding Tom and Ishii-san, even if this comes from 2005.
>
> Yeah, back in 8.1 that code was applied for both directions, but
> probably it should have enforced validation for same-encoding
> cases only for COPY FROM.
>
> It looks like now we have a mess, because the condition was copied
> verbatim into copyto.c but not copyfrom.c. Aren't we failing to
> validate encoding in this case in COPY FROM, which is where we
> actually need to?

I think the code is just very confusing - there actually *is* verification of
the encoding, it just happens at a different, earlier, layer, namely in
copyfromparse.c: CopyConvertBuf() which says:
/*
* If the file and server encoding are the same, no encoding conversion is
* required. However, we still need to verify that the input is valid for
* the encoding.
*/

And does indeed verify that.

One unfortunate issue: We don't have any tests verifying that COPY FROM
catches encoding issues.

Regards,

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2024-02-06 22:29:28 Re: Properly pathify the union planner
Previous Message Nathan Bossart 2024-02-06 21:50:22 Re: Psql meta-command conninfo+