Re: Practical error logging for very large COPY

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Practical error logging for very large COPY
Date: 2005-11-22 14:58:44
Message-ID: 11335.1132671524@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> I have committed the sin of omission again.

> Duplicate row violation is the big challenge, but not the only function
> planned. Formatting errors occur much more frequently, so yes we'd want
> to log all of that too. And yes, it would be done in the way you
> suggest.

> Here's a fuller, but still brief sketch:

> COPY ... FROM ....
> [ERRORTABLES format1 [uniqueness1]
> [ERRORLIMIT percent]]

This is getting worse, not better :-(

The general problem that needs to be solved is "trap any error that
occurs during attempted insertion of a COPY row, and instead of aborting
the copy, record the data and the error message someplace else". Seen
in that light, implementing a special path for uniqueness violations is
pretty pointless.

You could almost do this today in about five minutes with a PG_TRY
construct. The hard part is to distinguish errors that COPY can safely
trap from errors that must be allowed to abort the transaction anyway
(usually because the backend won't be in a consistent state if it's not
allowed to do post-abort cleanup). I think the latter class would
mostly be "internal" errors, and so not trapping them shouldn't be a big
problem for usefulness; but we can't simply ignore the possibility that
they would occur during COPY.
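
To make the distinction concrete, here is a rough, self-contained sketch of
the per-row trap. It is not backend code: it uses plain setjmp/longjmp as a
stand-in for PG_TRY (which is itself a sigsetjmp-based macro), and every name
in it (Loader, raise_error, insert_row, copy_rows, the two error classes, the
"CORRUPT" marker row) is hypothetical and exists only for illustration. It
also skips the cleanup a real backend would need after trapping an error.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ROWS 16

/* Hypothetical error classes for this sketch, not PostgreSQL's real ones. */
typedef enum { ERR_DATA, ERR_INTERNAL } ErrClass;

typedef struct {
    int  keys[MAX_ROWS];        /* the "table": one unique integer column */
    int  nkeys;
    int  rejected;              /* rows diverted to the error log */
    int  aborted;               /* 1 if an untrappable error stopped the load */
    char errlog[MAX_ROWS][64];  /* "row: message" for each rejected row */
} Loader;

static jmp_buf    *handler = NULL;  /* innermost active trap, if any */
static ErrClass    err_class;
static const char *err_msg;

/* Stand-in for raising an error: jump to the active trap, or die. */
static void raise_error(ErrClass cls, const char *msg)
{
    err_class = cls;
    err_msg = msg;
    if (handler == NULL) {
        fprintf(stderr, "FATAL: %s\n", msg);
        abort();
    }
    longjmp(*handler, 1);
}

/* Insert one row; raises instead of returning on any failure. */
static void insert_row(Loader *ld, const char *row)
{
    char *end;
    long  v;
    int   i;

    if (strcmp(row, "CORRUPT") == 0)
        raise_error(ERR_INTERNAL, "internal state corrupted");
    v = strtol(row, &end, 10);
    if (end == row || *end != '\0')
        raise_error(ERR_DATA, "invalid integer");
    for (i = 0; i < ld->nkeys; i++)
        if (ld->keys[i] == (int) v)
            raise_error(ERR_DATA, "duplicate key");
    assert(ld->nkeys < MAX_ROWS);
    ld->keys[ld->nkeys++] = (int) v;
}

/* Load rows, trapping data errors per row and aborting on anything else. */
static void copy_rows(Loader *ld, const char **rows, int nrows)
{
    jmp_buf      trap;
    volatile int i;

    assert(nrows <= MAX_ROWS);
    for (i = 0; i < nrows; i++) {
        handler = &trap;
        if (setjmp(trap) == 0) {
            insert_row(ld, rows[i]);
        } else if (err_class == ERR_DATA) {
            /* Safe to trap: record the row and the message, keep going. */
            snprintf(ld->errlog[ld->rejected++], sizeof ld->errlog[0],
                     "%s: %s", rows[i], err_msg);
        } else {
            /* Not safe to trap: stop here and let the caller abort. */
            ld->aborted = 1;
            break;
        }
    }
    handler = NULL;
}
```

Given the rows "1", "2", "x3", "2", "5", copy_rows would insert three rows and
log two (one format error, one duplicate), with no special path for either
kind. A "CORRUPT" row stops the load instead. The easy part is the trap; the
real work is deciding which errors belong in the first branch.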

regards, tom lane
