Re: libpq copy error handling busted

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: libpq copy error handling busted
Date: 2020-06-06 04:30:04
Message-ID: 20200606043004.xyo52casyaman7sq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-06-03 18:41:28 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > When libpq is used to COPY data to the server, it doesn't properly
> > handle errors.
> > This is partially an old problem, and partially got recently
> > worse. Before the below commit we detected terminated connections, but
> > we didn't handle copy failing.
>
> Yeah. After poking at that for a little bit, there are at least three
> problems:
>
> * pqSendSome() is responsible not only for pushing out data, but for
> calling pqReadData in any situation where it can't get rid of the data
> promptly. 1f39a1c06 overlooked that requirement, and the upshot is
> that we don't necessarily notice that the connection is broken (it's
> pqReadData's job to detect that). Putting a pqReadData call into
> the early-exit path helps, but doesn't fix things completely.

Is that fully necessary? Couldn't we handle at least the case I had by
looking at write_failed in additional places?

It might still be the right thing to continue to call pqReadData() from
pqSendSome(), don't get me wrong.

> * The more longstanding problem is that the PQputCopyData code path
> doesn't have any mechanism for consuming an 'E' (error) message
> once pqReadData has collected it. AFAICS that's ancient.

Yea, I looked back quite a bit, and it looked that way for a long
time. I thought for a moment that it might be related to the copy-both
introduction, but it wasn't.

> I think that the idea was to let the client dump all its copy data and
> then report the error message when PQendCopy is called, but as you
> say, that's none too friendly when there's gigabytes of data involved.
> I'm not sure we can do anything about this without effectively
> changing the client API for copy errors, though.

Hm. Why would it *really* be an API change? Until recently connection
failures etc were returned from PQputCopyData(), and it is documented
that way:

/*
* PQputCopyData - send some data to the backend during COPY IN or COPY BOTH
*
* Returns 1 if successful, 0 if data could not be sent (only possible
* in nonblock mode), or -1 if an error occurs.
*/
int
PQputCopyData(PGconn *conn, const char *buffer, int nbytes)

So consuming 'E' when in copy mode doesn't seem like a crazy change to
me. Probably a bit too big to backpatch though. But given that this
hasn't been complained about much in however many years...

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2020-06-06 04:57:20 Vacuuming the operating system documentation
Previous Message Andres Freund 2020-06-06 04:11:46 hashagg slowdown due to spill changes