Re: Bulkloading using COPY - ignore duplicates?

From: Lee Kindness <lkindness(at)csl(dot)co(dot)uk>
To: Patrick Welche <prlw1(at)newn(dot)cam(dot)ac(dot)uk>
Cc: Lee Kindness <lkindness(at)csl(dot)co(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bulkloading using COPY - ignore duplicates?
Date: 2001-12-13 13:25:11
Message-ID: 15384.44087.553142.160600@elsick.csl.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Patrick Welche writes:
> On Mon, Oct 01, 2001 at 03:17:43PM +0100, Lee Kindness wrote:
> > Please, don't get me wrong - I don't want to come across arrogant. I'm
> > simply trying to improve the 'COPY FROM' command in a situation where
> > speed is a critical issue and the data is dirty... And that must be a
> > relatively common scenario.
> Isn't that when you do your bulk copy into into a holding table, then
> clean it up, and then insert into your live system?

That's what I'm currently doing as a workaround - a SELECT DISTINCT
from a temporary table into the real table with the unique index on
it. However this takes absolute ages - say 5 seconds for the copy
(which is the ballpark figure I aiming toward and can achieve with
Ingres) plus another 30ish seconds for the SELECT DISTINCT.

The majority of database systems out there handle this situation in
one manner or another (MySQL ignores or replaces; Ingres ignores;
Oracle ignores or logs; others...). Indeed PostgreSQL currently checks
for duplicates in the COPY code but throws an elog(ERROR) rather than
ignoring the row, or passing the error back up the call chain.

My use of PostgreSQL is very time critical, and sadly this issue alone
may force an evaluation of Oracle's performance in this respect!

Best regards, Lee Kindness.

--
Lee Kindness, Senior Software Engineer, Concept Systems Limited.
http://services.csl.co.uk/ http://www.csl.co.uk/ +44 131 5575595

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2001-12-13 13:30:05 Re: Platform Testing - Cygwin
Previous Message Jean-Paul ARGUDO 2001-12-13 13:14:10 Hi all and greatings