Re: Perform COPY FROM encoding conversions in larger chunks

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Perform COPY FROM encoding conversions in larger chunks
Date: 2021-02-07 18:13:28
Message-ID: 21e3331f-d4f8-4750-d004-74a5abae42ec@iki.fi
Lists: pgsql-hackers

On 02/02/2021 23:42, John Naylor wrote:
> Although a new patch is likely forthcoming, I did take a brief look and
> found the following:
>
>
> In copyfromparse.c, this is now out of date:
>
>  * Read the next input line and stash it in line_buf, with conversion to
>  * server encoding.
>
>
> One of your FIXME comments seems to allude to this, but if we really
> need a difference here, maybe it should be explained:
>
> +#define INPUT_BUF_SIZE 65536 /* we palloc INPUT_BUF_SIZE+1 bytes */
>
> +#define RAW_BUF_SIZE 65536 /* allocated size of the buffer */

We do in fact still need the +1 for the NUL terminator. It was missing
from the last patch version, but that was wrong; my fuzz testing
actually uncovered a bug caused by that. Fixed.
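
To illustrate why the extra byte matters, here is a minimal sketch, not the
patch's actual code. The names DEMO_BUF_SIZE and demo_fill_buf are
hypothetical, and plain malloc stands in for palloc. When the input fills the
buffer completely, the terminator lands in the one byte beyond the nominal
size, so allocating only the nominal size would write out of bounds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BUF_SIZE 8			/* hypothetical; the patch uses 65536 */

/*
 * Copy up to DEMO_BUF_SIZE bytes of src into a newly allocated buffer and
 * NUL-terminate it.  The allocation is DEMO_BUF_SIZE + 1 bytes so that the
 * terminator still fits when the buffer is completely full.
 *
 * Returns the buffer (caller frees it), or NULL if allocation fails.
 * The number of payload bytes is stored in *nbytes.
 */
static char *
demo_fill_buf(const char *src, size_t srclen, size_t *nbytes)
{
	char	   *buf;
	size_t		n = (srclen < DEMO_BUF_SIZE) ? srclen : DEMO_BUF_SIZE;

	assert(src != NULL && nbytes != NULL);

	buf = malloc(DEMO_BUF_SIZE + 1);
	if (buf == NULL)
		return NULL;

	memcpy(buf, src, n);
	buf[n] = '\0';				/* n == DEMO_BUF_SIZE uses the extra byte */
	*nbytes = n;
	return buf;
}
```

With a 10-byte input, this keeps the first 8 bytes and writes the terminator
at index 8, which is only valid because of the +1 in the allocation.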

Attached are new patch versions. The first patch is the same as before, but
rebased, pgindented, and with a couple of tiny fixes where conversion
functions were still missing the "if (noError) break;" checks.
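
As a rough sketch of what such a check looks like, here is a toy conversion
loop. It is not one of the real conversion functions: demo_ascii_copy and
demo_report_invalid are hypothetical names, and the error routine simply
exits instead of raising a PostgreSQL error. The point is only the shape of
the check: with noError set, the loop stops at the first invalid byte and
returns how many source bytes were converted, instead of reporting an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error-raising routine that does not return. */
static void
demo_report_invalid(unsigned char c)
{
	fprintf(stderr, "invalid byte 0x%02x\n", c);
	exit(1);
}

/*
 * Toy "conversion": copy 7-bit ASCII bytes from src to dest.
 *
 * Returns the number of source bytes successfully converted.  dest must
 * have room for len + 1 bytes, because the output is NUL-terminated.
 */
static int
demo_ascii_copy(const unsigned char *src, unsigned char *dest,
				int len, bool noError)
{
	const unsigned char *start = src;

	assert(src != NULL && dest != NULL);

	while (len > 0)
	{
		if (*src == 0 || *src >= 0x80)
		{
			if (noError)
				break;			/* stop quietly; caller checks the count */
			demo_report_invalid(*src);
		}
		*dest++ = *src++;
		len--;
	}
	*dest = '\0';
	return (int) (src - start);
}
```

A caller that passes noError = true can compare the return value with len to
find out whether, and where, the input stopped being valid.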

I've hacked on the second patch more, doing more refactoring and
commenting for readability. I think it's in pretty good shape now.

- Heikki

Attachment Content-Type Size
v4-0001-Add-noError-argument-to-encoding-conversion-funct.patch text/x-patch 225.1 KB
v4-0002-Do-COPY-FROM-encoding-conversion-verification-in-.patch text/x-patch 39.2 KB
