Re: Bug in COPY FROM backslash escaping multi-byte chars

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: hlinnaka(at)iki(dot)fi
Cc: john(dot)naylor(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bug in COPY FROM backslash escaping multi-byte chars
Date: 2021-02-04 01:50:45
Message-ID: 20210204.105045.1361111646208149865.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Wed, 3 Feb 2021 15:46:30 +0200, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote in
> On 03/02/2021 15:38, John Naylor wrote:
> > On Wed, Feb 3, 2021 at 8:08 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > >
> > > Hi,
> > >
> > > While playing with COPY FROM refactorings in another thread, I noticed
> > > a corner case where I think backslash escaping doesn't work correctly.
> > > Consider the following input:
> > >
> > > \么.foo
> > I've seen multibyte delimiters in the wild, so it's not as outlandish
> > as it seems.
>
> We don't actually support multi-byte characters as delimiters or quote
> or escape characters:
>
> postgres=# copy copytest from 'foo' with (delimiter '么');
> ERROR: COPY delimiter must be a single one-byte character
>
> > The fix is simple enough, so +1.
>
> Thanks, I'll commit and backpatch shortly.

I'm not sure the assumption in the second hunk always holds, but
it is fine at least for Shift-JIS and Shift-JIS-2004, since both are
two-byte encodings.
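
To make the Shift-JIS case concrete: the second byte of a two-byte
Shift-JIS character can be 0x5c, the same value as an ASCII backslash
(the katakana ソ is 0x83 0x5c, for example). The Python sketch below is
only an illustration of that hazard, not the PostgreSQL code and not the
patch. The function names and the simplified lead-byte test are
assumptions made up for this example. It compares a scanner that skips a
single byte after a backslash with one that skips the whole character.

```python
from typing import List


def sjis_char_len(lead: int) -> int:
    """Length in bytes of a Shift-JIS character, judged from its first byte.

    Simplified assumption for this sketch: 0x81-0x9f and 0xe0-0xfc start
    a two-byte character; everything else is a single byte.
    """
    if 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC:
        return 2
    return 1


def escape_backslash_offsets(data: bytes, skip_whole_char: bool) -> List[int]:
    """Return offsets of the bytes the scanner treats as escaping backslashes.

    Hypothetical helper, not a PostgreSQL function. With
    skip_whole_char=False, only one byte is skipped after a backslash;
    with skip_whole_char=True, the full (possibly two-byte) character is.
    """
    offsets = []
    i = 0
    while i < len(data):
        if data[i] == 0x5C:
            offsets.append(i)
            step = 1
            if skip_whole_char and i + 1 < len(data):
                step = sjis_char_len(data[i + 1])
            i += 1 + step  # the backslash plus whatever it escapes
        else:
            i += sjis_char_len(data[i])  # ordinary character, skip it whole
    return offsets


# backslash, ソ (0x83 0x5c), dot, "foo"
sample = b"\\" + "ソ".encode("shift_jis") + b".foo"

print(escape_backslash_offsets(sample, skip_whole_char=False))  # [0, 2]
print(escape_backslash_offsets(sample, skip_whole_char=True))   # [0]
```

With the one-byte skip, the trailing 0x5c of ソ is taken for a second
backslash and is immediately followed by a dot. Skipping the whole
character leaves only the real backslash at offset 0.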

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
