Re: Bulkloading using COPY - ignore duplicates?

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Lee Kindness <lkindness(at)csl(dot)co(dot)uk>
Cc:	Jim Buttafuoco <jim(at)buttafuoco(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Bulkloading using COPY - ignore duplicates?
Date:	2001-12-17 20:59:01
Message-ID:	Pine.LNX.4.30.0112171817590.642-100000@peter.localdomain
Lists:	pgsql-hackers

Lee Kindness writes:

> Consider SELECT DISTINCT - which is the 'duplicate' and which one is
> the good one?

It's not the same thing. SELECT DISTINCT only eliminates rows that are
completely the same, not rows that are merely equal in their unique constraints.
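
To make the distinction concrete, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) instead of PostgreSQL, purely so that it runs standalone, and the staging table and its columns are hypothetical. Two rows that share a key but differ in payload both survive SELECT DISTINCT:

```python
import sqlite3


def distinct_rows(rows):
    """Load rows into a hypothetical staging table and return SELECT DISTINCT."""
    conn = sqlite3.connect(":memory:")
    # No unique constraint here: this only shows what DISTINCT itself removes.
    conn.execute("CREATE TABLE staging (id INTEGER, payload TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT DISTINCT id, payload FROM staging ORDER BY id, payload"
    ).fetchall()
    conn.close()
    return result


rows = [(1, "a"), (1, "a"), (1, "b"), (2, "c")]
result = distinct_rows(rows)

# Only the fully identical row (1, 'a') was collapsed; id 1 still appears twice.
print(result)
```

If id carried a unique constraint on the target table, the rows (1, 'a') and (1, 'b') would still conflict, so DISTINCT alone does not answer which of them is "the duplicate".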

Maybe you're thinking of SELECT DISTINCT ON (). Observe the big warning
that the results of that statement are unpredictable unless ORDER BY is used.
But that's not the same thing either. We've never claimed that the COPY
input has an ordering assumption. In fact you're asking for a bit more
than an ordering assumption: you're saying that the earlier data is better
than the later data. I think in a typical use case that is more likely
*not* to be the case, because the data at the end is newer.
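
The "which one wins" question can be sketched outside the database. The helper below is hypothetical and is not a description of how COPY behaves or would behave; it only shows that "keep the first" and "keep the last" are different policies that give different results on the same input:

```python
def dedupe(rows, key, keep="first"):
    """Collapse rows that share a key, keeping the first or the last one seen."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    chosen = {}
    for row in rows:
        k = key(row)
        # "first": never overwrite an existing entry; "last": always overwrite.
        if keep == "last" or k not in chosen:
            chosen[k] = row
    return list(chosen.values())


# Hypothetical input in arrival order: the row at the end is the newer one.
rows = [(1, "old"), (2, "x"), (1, "new")]

first_wins = dedupe(rows, key=lambda r: r[0], keep="first")
last_wins = dedupe(rows, key=lambda r: r[0], keep="last")

print(first_wins)
print(last_wins)
```

Silently ignoring duplicates during a load corresponds to the "first" policy, which here discards the newer row for key 1.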

Btw., here's another concern about this proposed feature: If I do a
client-side COPY, how will you send the "ignored" rows back to the client?

--
Peter Eisentraut peter_e(at)gmx(dot)net

In response to

Re: Bulkloading using COPY - ignore duplicates? at 2001-12-17 12:48:30 from Lee Kindness

Responses

Re: Bulkloading using COPY - ignore duplicates? at 2001-12-18 10:09:14 from Lee Kindness
