Re: COPY issue(gsoc project)

From: Neil Conway <neilc(at)samurai(dot)com>
To: longlong <asfnuts(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY issue(gsoc project)
Date: 2008-03-11 22:18:46
Message-ID: 1205273926.23742.25.camel@dell.linuxdev.us.dell.com
Lists: pgsql-hackers

On Tue, 2008-03-11 at 20:56 +0800, longlong wrote:
> This would be a nice feature. Right now there are often applications
> where there is a data loading or staging table that ends up being
> merged with a larger table after some cleanup. Moving that data from
> the preparation area into the final table right now is most easily
> done with INSERT INTO X (SELECT A,B FROM C) type actions. This is
> slow because INSERT takes much longer than COPY.

Why would INSERT INTO ... SELECT be any slower than COPY ... FROM
SELECT?

> 2. This comes from the TODO list: COPY always behaves like a unit of
> work that consists of several INSERT commands; if any error occurs, it
> rolls back. But sometimes we only care that the data gets inserted. In
> that situation, I used to use "try....catch...." and insert row by row
> to skip the errors, because it would take too much time to examine
> every row. So:
> Allow COPY to report error lines and continue.
> This is a good idea.

Search the archives for prior discussions of this idea; the
implementation will require some careful thought. This is a relevant
thread:

http://markmail.org/message/y3atxu56s2afgidg

Note also that pg_bulkload currently does something analogous to this
outside of the DBMS proper:

http://pgbulkload.projects.postgresql.org/
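
For reference, the row-by-row "try....catch...." workaround described in
the quoted text might look roughly like the sketch below. It is only an
illustration, not a proposed COPY implementation: it uses Python's
built-in sqlite3 module in place of PostgreSQL so that it runs standalone,
and the table name "target", its columns, and the helper
load_rows_skipping_errors are all hypothetical.

```python
import sqlite3


def load_rows_skipping_errors(conn, rows):
    """Insert rows one at a time and collect the rejected ones.

    Hypothetical helper: each row is committed on its own, so a bad row
    is reported and skipped instead of rolling back the whole load.
    """
    rejected = []
    for line_no, row in enumerate(rows, start=1):
        try:
            # "with conn" commits on success and rolls back on an exception,
            # which makes every row its own small unit of work.
            with conn:
                conn.execute(
                    "INSERT INTO target (id, name) VALUES (?, ?)", row
                )
        except sqlite3.IntegrityError as exc:
            # Remember the input line, the data, and the reason it failed.
            rejected.append((line_no, row, str(exc)))
    return rejected


# Assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Line 2 violates NOT NULL; line 3 duplicates the primary key of line 1.
rows = [(1, "a"), (2, None), (1, "dup"), (3, "c")]
rejected = load_rows_skipping_errors(conn, rows)
loaded = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]

for line_no, row, reason in rejected:
    print(f"line {line_no} rejected: {row!r} ({reason})")
print(f"{loaded} rows loaded")
```

The per-row transaction is what makes this approach slow compared to a
single bulk load, which is the motivation for having COPY itself report
error lines and continue.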

> Which one should I choose to propose, or both?

FWIW, error handling for COPY sounds like a more useful project to me.

-Neil
