Re: Skipping duplicate records?

From: Marc SCHAEFER <schaefer(at)alphanet(dot)ch>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Skipping duplicate records?
Date: 2001-06-08 12:54:53
Message-ID: Pine.LNX.3.96.1010607094151.988E-100000@defian.alphanet.ch
Lists: pgsql-general

On Thu, 7 Jun 2001, Steve Micallef wrote:

> 'mysqlimport' has the ability to skip duplicate records when doing bulk
> imports from non-binary files. PostgreSQL doesn't seem to have this
> feature, and it causes a problem for me as I import extremely large
> amounts of data into Postgres using 'copy' and it rejects the whole file
> if one record breaches the primary key.

As a quick comment, I personally find the above to be a *feature*. If
something goes wrong during the COPY, I really want it to be handled in
a transactional manner and simply do nothing. Otherwise it's a pain to
find out WHAT was actually inserted, etc.
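
For instance (mytable and the file path are placeholder names), a COPY
either loads the whole file or nothing:

-- COPY runs as a single unit: if any row breaches the primary
-- key, the statement aborts and no rows at all are inserted
COPY mytable FROM '/tmp/data.txt';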

Your problem is really that your input data is invalid: it doesn't
respect the constraints you want to enforce on it.

You could:

- import the data into a non-constrained table (no UNIQUE or
PRIMARY KEY constraints), and once the import is complete, remove
the duplicates

assuming id is your to-be-primary-key:

SELECT DISTINCT t1.id
FROM temp_table t1, temp_table t2
WHERE (t1.id = t2.id) AND (t1.oid != t2.oid);

And now it's up to you to decide which of the records sharing a
duplicate ID is the one to keep; one possible approach is sketched
below.
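
If keeping an arbitrary one of each set of duplicates is acceptable
(say, the row with the lowest oid), a minimal sketch could look like
this; temp_table and id come from the example above, final_table is a
hypothetical table carrying the real PRIMARY KEY, and the staging
table is assumed to still have oids (the default):

-- keep only the row with the lowest oid for each id
DELETE FROM temp_table
WHERE oid NOT IN (SELECT min(oid)
                  FROM temp_table
                  GROUP BY id);

-- final_table is a hypothetical table with PRIMARY KEY (id);
-- the insert now succeeds, since each id occurs exactly once
INSERT INTO final_table SELECT * FROM temp_table;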
