Re: Bulkloading using COPY - ignore duplicates?

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Lee Kindness <lkindness(at)csl(dot)co(dot)uk>
Cc: Jim Buttafuoco <jim(at)buttafuoco(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bulkloading using COPY - ignore duplicates?
Date: 2001-12-17 20:59:01
Message-ID: Pine.LNX.4.30.0112171817590.642-100000@peter.localdomain
Lists: pgsql-hackers

Lee Kindness writes:

> Consider SELECT DISTINCT - which is the 'duplicate' and which one is
> the good one?

It's not the same thing. SELECT DISTINCT only eliminates rows that are
completely the same, not rows that are merely equal in their unique
constraints.
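
To make the distinction concrete, here is a minimal Python sketch (not
PostgreSQL code). The sample rows, and the choice of the first column as the
one carrying the unique constraint, are assumptions made up for illustration.
Whole-row deduplication, which is what SELECT DISTINCT does, still leaves two
rows with the same key because they differ in another column.

```python
# Hypothetical input rows: (id, value); assume "id" carries the unique constraint.
rows = [(1, "a"), (1, "a"), (1, "b"), (2, "c")]

# Like SELECT DISTINCT: drop only rows that are identical in every column.
# dict.fromkeys keeps the first occurrence and preserves input order.
distinct_rows = list(dict.fromkeys(rows))

# Keys that are still duplicated after whole-row deduplication.
ids = [row[0] for row in distinct_rows]
conflicting_ids = sorted({i for i in ids if ids.count(i) > 1})
```

Here (1, "a") and (1, "b") both survive, so loading them would still violate
the unique constraint on id.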

Maybe you're thinking of SELECT DISTINCT ON (). Observe the big warning
that the results of that statement are random unless ORDER BY is used. --
But that's not the same thing either. We've never claimed that the COPY
input has an ordering assumption. In fact you're asking for a bit more
than an ordering assumption: you're saying that the earlier data is better
than the later data. I think in a random use case that is more likely
*not* to be the case because the data at the end is newer.
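
The two possible policies can be sketched in Python as follows. This is only
an illustration of the choice being discussed, not how COPY works or would
work; the function names keep_first and keep_last and the sample rows are
hypothetical.

```python
# Hypothetical COPY input in file order: (id, value); "id" is assumed unique.
rows = [(1, "old"), (2, "x"), (1, "new")]

def keep_first(rows):
    """Ignore later duplicates: the earliest row per key wins."""
    kept = {}
    for key, value in rows:
        kept.setdefault(key, value)  # only stores the key if not seen before
    return kept

def keep_last(rows):
    """Later rows overwrite earlier ones: the newest row per key wins."""
    kept = {}
    for key, value in rows:
        kept[key] = value
    return kept
```

With this input, keep_first(rows) retains "old" for id 1 while keep_last(rows)
retains "new". An "ignore duplicates" option silently picks the first policy.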

Btw., here's another concern about this proposed feature: If I do a
client-side COPY, how will you send the "ignored" rows back to the client?

--
Peter Eisentraut peter_e(at)gmx(dot)net
