Re: [HACKERS] Trouble with COPY IN

From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: James William Pye <lists(at)jwp(dot)name>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Maciek Sakrejda <msakrejda(at)truviso(dot)com>, Samuel Gendler <sgendler(at)ideasculptor(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] Trouble with COPY IN
Date: 2010-07-29 09:15:14
Message-ID: alpine.DEB.2.00.1007290952210.2654@aragorn.flymine.org
Lists: pgsql-hackers pgsql-jdbc

(Yes, I know I'm not on the hackers list. Most interested parties should
get this directly anyway.)

>> Additionally the interface exposed by the JDBC driver lets the user
>> write arbitrary CopyData bytes to the server, so without parsing all of
>> that we don't know whether they've issued CopyData(EOF) or not.
>
> Okay, so you can't know with absolute certainty without parsing the
> data, but the usual case would be handled by holding onto the last-N
> bytes or so. Enough to fit the EOF and perhaps a little more for
> paranoia's sake.
>
> That's not to say that I'm missing the problem. When (not "if", "when")
> the user feeds data past a CopyData(EOF), it's going to get interesting.

This is why the patch to the JDBC driver that I sent in is so fragile. If
a user supplies a binary copy with lots of data after the EOF, the
processCopyData method *will* get called after the CommandComplete and
ReadyForQuery messages have been received, even if we try to delay
processing of the ReadyForQuery message.

> [Thinking about the logic necessary to handle such a case and avoid
> network buffer deadlock...] I would think the least invasive way to
> handle it would be to set the CommandComplete and ReadyForQuery messages
> aside when they are received if CopyDone hasn't been sent, continue the
> COPY operation as usual until it is shut down, send CopyDone and,
> finally, "reinstate" CommandComplete and RFQ as if they were just
> received.

Basically, yes. We need to introduce a little more state into the JDBC
driver. Currently, the driver is in one of two states:

1. In the middle of a copy.
2. Not in a copy.

These states are recorded in the lock system. We need to introduce a third
state, in which the copy lock is still held but we know that the
CommandComplete and ReadyForQuery messages have already been received. We
can no longer unlock the copy in processCopyData - we need to do that in
endCopy instead, and only after endCopy has called processCopyData to
ensure that we wait for a valid CommandComplete and ReadyForQuery message
first.
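
To make the three states concrete, here is a minimal sketch. It is not the
driver's actual code: the class CopyStateSketch, the State enum and every
method name in it are hypothetical, and the network read that
processCopyData would perform is replaced by a stand-in.

```java
/**
 * Hypothetical three-state tracker for a COPY IN operation.
 * Illustration only: none of these names come from the real JDBC driver.
 */
final class CopyStateSketch {
    enum State {
        NOT_IN_COPY,             // no copy lock held
        IN_COPY,                 // copy lock held, server still accepting data
        COPY_LOCKED_SERVER_DONE  // lock held, CommandComplete + ReadyForQuery seen
    }

    private State state = State.NOT_IN_COPY;

    State state() {
        return state;
    }

    /** Take the copy lock when the COPY starts. */
    void startCopy() {
        if (state != State.NOT_IN_COPY) {
            throw new IllegalStateException("copy already active");
        }
        state = State.IN_COPY;
    }

    /**
     * Called when CommandComplete and ReadyForQuery arrive, possibly early
     * because the user wrote data past the binary EOF marker.
     * The lock is deliberately NOT released here.
     */
    void onServerFinished() {
        if (state == State.IN_COPY) {
            state = State.COPY_LOCKED_SERVER_DONE;
        }
    }

    /** True while user data should still be forwarded to the server. */
    boolean shouldSend() {
        if (state == State.NOT_IN_COPY) {
            throw new IllegalStateException("no copy active");
        }
        return state == State.IN_COPY;
    }

    /** The only place where the copy lock is released. */
    void endCopy() {
        if (state == State.NOT_IN_COPY) {
            throw new IllegalStateException("no copy active");
        }
        if (state == State.IN_COPY) {
            awaitServerCompletion();
        }
        state = State.NOT_IN_COPY;
    }

    /**
     * Stand-in for sending CopyDone and reading until CommandComplete and
     * ReadyForQuery arrive; here it only records that they were seen.
     */
    private void awaitServerCompletion() {
        onServerFinished();
    }
}
```

The point of the design is that onServerFinished() never drops the lock, so
a late write from the user finds the copy still locked and can be discarded
rather than sent on a connection that is no longer in copy mode.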

Matthew

--
Terrorists evolve but security is intelligently designed? -- Jake von Slatt
