Re: Force lookahead in COPY FROM parsing

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Force lookahead in COPY FROM parsing
Date:	2021-04-06 17:50:11
Message-ID:	6e1305a7-09c0-1e1f-f2a3-68e641aab431@iki.fi
Lists:	pgsql-hackers

On 02/04/2021 20:21, John Naylor wrote:
> I have nothing further so it's RFC. The patch is pretty simple compared
> to the earlier ones, but is worth running the fuzzer again as added
> insurance?

Good idea. I did that, and indeed it revealed bugs. If the client sent
just a single byte in one CopyData message, we only loaded that one byte
into the buffer, instead of the full 4 bytes needed for lookahead.
Attached is a new version that fixes that.
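A minimal sketch of the idea behind that fix, written as a Python model rather than the actual PostgreSQL C code: keep pulling messages until the buffer holds the desired lookahead or the input ends. The names `fill_buffer` and `LOOKAHEAD` are hypothetical and do not come from the patch.

```python
from typing import Iterator

LOOKAHEAD = 4  # bytes of lookahead the parser wants (figure taken from the mail)


def fill_buffer(buf: bytearray, messages: Iterator[bytes],
                want: int = LOOKAHEAD) -> bool:
    """Append whole messages to buf until it holds at least `want` bytes.

    Returns False if the message stream ran out first; in that case buf
    may hold fewer than `want` bytes and the caller is at end of input.
    """
    while len(buf) < want:
        chunk = next(messages, None)
        if chunk is None:
            return False        # no more messages: end of input
        buf.extend(chunk)       # a message may carry just one byte
    return True


if __name__ == "__main__":
    buf = bytearray()
    # Four one-byte messages still end up as four bytes of lookahead.
    print(fill_buffer(buf, iter([b"a", b"b", b"c", b"d"])), bytes(buf))
```

The loop matters because a single read is not guaranteed to deliver more than one byte; stopping after the first message is the failure mode described above.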

Unfortunately, that's not the end of it. Consider the byte sequence
"\.<NL><some invalid bytes>" appearing at the end of the input. We
should detect the end-of-copy marker \. and stop reading without
complaining about the garbage after the end-of-copy marker. That doesn't
work if we force 4 bytes of lookahead; the invalid byte sequence fits in
the lookahead window, so we will try to convert it.
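The failure can be illustrated with a toy Python model; this is not the PostgreSQL code and not a proposed fix. It assumes UTF-8 as the encoding, uses a fixed 4-byte window, and ignores the rule that the marker must stand on a line of its own. The function names are hypothetical.

```python
END_MARKER = b"\\.\n"   # backslash, period, newline


def parse_convert_first(raw: bytes, window: int = 4) -> str:
    """Convert each lookahead window before scanning it for the marker."""
    text = ""
    pos = 0
    while pos < len(raw):
        chunk = raw[pos:pos + window]
        # Toy simplification: assumes no valid multi-byte character is
        # split across two windows.
        text += chunk.decode("utf-8")   # raises on garbage in the window
        pos += len(chunk)
        end = text.find(END_MARKER.decode("ascii"))
        if end != -1:
            return text[:end]
    return text


def parse_marker_first(raw: bytes) -> str:
    """Stop at the marker first, then convert only the data before it."""
    end = raw.find(END_MARKER)
    payload = raw if end == -1 else raw[:end]
    return payload.decode("utf-8")


if __name__ == "__main__":
    sample = b"a\n\\.\n\xff\xfe"   # one row, the marker, then invalid bytes
    print(repr(parse_marker_first(sample)))
    try:
        parse_convert_first(sample)
    except UnicodeDecodeError as exc:
        print("conversion failed:", exc.reason)
```

In the sample, the newline that completes the marker shares a window with the invalid bytes, so the convert-first variant fails before it can recognise the end of the data.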

I'm sure that can be fixed, for example by adding special handling for
the last few bytes of the input. But it needs some more thinking, so this
patch isn't quite ready to be committed yet :-(.

- Heikki

Attachment	Content-Type	Size
v4-0001-Simplify-COPY-FROM-parsing-by-forcing-lookahead.patch	text/x-patch	9.1 KB

In response to

Re: Force lookahead in COPY FROM parsing at 2021-04-02 17:21:59 from John Naylor
