Re: copy_from does not stop reading after an error

From: Nicolas Grilly <nicolas(at)gardentechno(dot)com>
To: Federico Di Gregorio <federico(dot)digregorio(at)dndg(dot)it>, psycopg(at)postgresql(dot)org
Subject: Re: copy_from does not stop reading after an error
Date: 2011-02-01 12:57:03
Message-ID: AANLkTimSyph9d15GnfmyAOjC=4sMXpTWZbW1taS63ayr@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

I have tested again the \copy command of psql and, contrary to what I wrote
before, it looks like psql does not stop reading after an error either, and
has to read the complete file before reporting the error.

Conclusion: psycopg2, pg8000 and psql have the same behavior regarding the
command "copy from stdin". The input data file is read entirely, even if
there is incorrect data at the start of the file, and errors are reported
only after having read the complete file.

Therefore it is probably not a bug in psycopg2, and just a "limitation" of
PostgreSQL protocol. Here is the protocol official documentation:
http://www.postgresql.org/docs/9.0/static/protocol-flow.html#PROTOCOL-COPY

I understand we have to "end" the copy before having a chance to retrieve
PostgreSQL backend response and know if our data are correct, or not. Do you
confirm this analysis?

It means copy_from is not designed to send a 10 gigabytes stream to
PostgreSQL, with just one "copy from stdin" command. Maybe I have to split
my input stream into smaller chunks and execute a "copy from stdin" command
for each of them. Do you confirm this is the only (and adequate) solution?

Thanks for you help and advice.

PS: I've copied that email to lighthouse for future reference.

On Tue, Feb 1, 2011 at 12:34, Nicolas Grilly <nicolas(at)gardentechno(dot)com>wrote:

> Thank you Federico for your answer.
>
> I have ran the same script with pg8000, and it does not stop reading after
> an error either... Maybe it is not a bug, and just a limitation of the
> PostgreSQL protocol? Maybe the copy from protocol is not designed to return
> errors in the middle of the data stream, and I just have to split my data
> stream into many chunks and call copy_from for each chunk?
>
> On Tue, Feb 1, 2011 at 11:48, Federico Di Gregorio <
> federico(dot)digregorio(at)dndg(dot)it> wrote:
>
>> On 01/02/11 11:24, Nicolas Grilly wrote:
>> > It looks like copy_from does not stop reading after an error. When the
>> > input file is short, it is okay. But when the input file is very long,
>> > it is really boring to wait for the entire file to be read just to
>> > discover there is an error on the 10th row.
>>
>> Thank you very much. I just used your email to create a bug report on
>> Lighthouse:
>>
>>
>> http://psycopg.lighthouseapp.com/projects/62710-psycopg/tickets/37-copy_from-does-not-stop-reading-after-an-error
>>
>> federico
>>
>
>

In response to

Responses

Browse psycopg by date

  From Date Subject
Next Message Marko Kreen 2011-02-01 13:18:29 Re: copy_from does not stop reading after an error
Previous Message Federico Di Gregorio 2011-02-01 11:50:07 Re: At FOSDEM