Re: Bulkloading using COPY - ignore duplicates?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lee Kindness <lkindness(at)csl(dot)co(dot)uk>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Jim Buttafuoco <jim(at)buttafuoco(dot)net>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bulkloading using COPY - ignore duplicates?
Date: 2001-12-18 15:04:08
Message-ID: 13185.1008687848@sss.pgh.pa.us
Lists: pgsql-hackers

Lee Kindness <lkindness(at)csl(dot)co(dot)uk> writes:
> You're right - I was meaning 'SELECT DISTINCT ON ()'. However I'm only
> using it as an example of where the database is choosing (be it
> randomly) the data to be discarded.

Not a good example to support your argument. The entire point of
DISTINCT ON (imho) is that the rows that are kept or discarded are
*not* random, but can be selected by the user by specifying additional
sort columns. DISTINCT ON would be pretty useless if it weren't for
that flexibility. The corresponding concept in COPY will need to
provide flexible means for deciding which row to keep and which to
drop, else it'll be pretty useless.
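
To make the point about sort columns concrete, here is a minimal sketch that emulates the DISTINCT ON behaviour in Python. It is an illustration only, not how the backend implements it. The helper `distinct_on` and the `station`/`reading_time`/`value` fields are hypothetical names invented for the example. The query being mimicked would be along the lines of `SELECT DISTINCT ON (station) * FROM readings ORDER BY station, reading_time DESC`, where `readings` is likewise an assumed table.

```python
from itertools import groupby
from operator import itemgetter


def distinct_on(rows, key, order_by, descending=False):
    """Keep one row per `key`; `order_by` decides which row survives.

    Hypothetical helper that mimics
    SELECT DISTINCT ON (key) ... ORDER BY key, order_by [DESC].
    """
    # Sort on the tie-breaker first, then stably on the key, so that
    # within each key group the preferred row ends up first.
    ordered = sorted(rows, key=order_by, reverse=descending)
    ordered = sorted(ordered, key=key)
    # The first row of each group is the one that is kept.
    return [next(group) for _, group in groupby(ordered, key=key)]


# Assumed sample data: several readings per station.
rows = [
    {"station": "A", "reading_time": 1, "value": 10},
    {"station": "A", "reading_time": 3, "value": 30},
    {"station": "B", "reading_time": 2, "value": 20},
    {"station": "A", "reading_time": 2, "value": 25},
]

# Same duplicates, different survivors, depending on the sort column.
latest = distinct_on(rows, itemgetter("station"),
                     itemgetter("reading_time"), descending=True)
earliest = distinct_on(rows, itemgetter("station"),
                       itemgetter("reading_time"))
```

With the same input, `latest` keeps the reading at time 3 for station A while `earliest` keeps the one at time 1. The user controls that choice through the ordering, and a duplicate-skipping COPY would need to offer a comparable choice.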

regards, tom lane
