Re: Bulkloading using COPY - ignore duplicates?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lee Kindness <lkindness(at)csl(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bulkloading using COPY - ignore duplicates?
Date: 2001-10-01 13:36:36
Message-ID: 22968.1001943396@sss.pgh.pa.us
Lists: pgsql-hackers

Lee Kindness <lkindness(at)csl(dot)co(dot)uk> writes:
> Would this seem a reasonable thing to do? Does anyone rely on COPY
> FROM causing an ERROR on duplicate input?

Yes. This change will not be acceptable unless it's made an optional
(and not default, IMHO, though perhaps that's negotiable) feature of
COPY.

The implementation might be rather messy too. I don't much care for the
notion of a routine as low-level as bt_check_unique knowing whether or
not the context is COPY. We might have to do some restructuring to
avoid that.

> Would:
> WITH ON_DUPLICATE = CONTINUE|TERMINATE (or similar)
> need to be added to the COPY command (I hope not)?

It occurs to me that skip-the-insert might be a useful option for
INSERTs that detect a unique-key conflict, not only for COPY. (Cf.
the regular discussions we see on whether to do INSERT first or
UPDATE first when the key might already exist.) Maybe a SET variable
that applies to all forms of insertion would be appropriate.
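To make that concrete, here is a rough sketch of how such a variable
might behave (the name and values are invented purely for illustration;
nothing of the sort exists today):

    -- hypothetical variable controlling duplicate-key handling for all
    -- forms of insertion (name and values invented for illustration)
    SET duplicate_key_action = 'skip';   -- default would remain 'error'

    CREATE TABLE t (id integer PRIMARY KEY, val text);
    INSERT INTO t VALUES (1, 'first');   -- inserted normally
    INSERT INTO t VALUES (1, 'second');  -- unique-key conflict: row skipped, no ERROR
    COPY t FROM '/tmp/data';             -- duplicate rows in the file skipped likewise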

regards, tom lane
