Re: COPY enhancements

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Emmanuel Cecchet <manu(at)asterdata(dot)com>, Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY enhancements
Date: 2009-10-09 15:42:19
Message-ID: alpine.GSO.2.01.0910091133490.29520@westnet.com
Lists: pgsql-hackers

On Fri, 9 Oct 2009, Tom Lane wrote:

> what do we do with rows that fail encoding conversion? For logging to a
> file we could/should just decree that we write out the original,
> allegedly-in-the-client-encoding data. I'm not sure what we do about
> logging to a table though. The idea of storing bytea is pretty
> unpleasant but there might be little choice.

I think this detail can be punted: document the limitation and log the
error, without trying to handle it perfectly. In most use cases I've seen
here, saving the rows to the "reject" file/table is a convenience rather
than a hard requirement. You can always dig them back out of the original
input if you see an encoding error in the logs, and it's rare that you can
completely automate that anyway.

The main purpose of the reject file/table is to accumulate things you
might fix by hand or by systematic update (e.g. adding ",\N" for a missing
column when warranted) before trying a re-import for review. I suspect
the users of this feature would be OK with knowing that it can't be 100%
accurate in the face of encoding errors. It's more important that in the
usual cases, things like bad delimiters and missing columns, you can
easily manipulate the rejects as simple text. Making that harder just for
this edge case wouldn't match the priorities of the users of this feature
I've encountered.
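
To illustrate the kind of systematic update meant here, a minimal sketch
in Python follows. It assumes the rejects are plain comma-delimited text
with no quoted or escaped delimiters, and that missing columns are always
trailing ones; the function names and the expected column count are
hypothetical, not part of any proposed COPY interface.

```python
NULL_MARKER = r"\N"  # COPY text-format NULL marker


def pad_missing_columns(line, expected_columns, delimiter=","):
    """Append NULL markers to a rejected row that has too few columns.

    Assumption: a naive split is good enough, i.e. the data contains no
    escaped or quoted delimiters. Rows that already have enough columns
    are returned unchanged (minus the trailing newline).
    """
    row = line.rstrip("\n")
    fields = row.split(delimiter)
    missing = expected_columns - len(fields)
    if missing <= 0:
        return row
    return row + (delimiter + NULL_MARKER) * missing


def repair_rejects(lines, expected_columns, delimiter=","):
    """Apply pad_missing_columns to every rejected row."""
    return [pad_missing_columns(l, expected_columns, delimiter) for l in lines]
```

The repaired lines would still need a review before the re-import, since
padding with NULLs is only right "when warranted".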

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
