Tom Lane writes:
> Lee Kindness <lkindness(at)csl(dot)co(dot)uk> writes:
> > Would this seem a reasonable thing to do? Does anyone rely on COPY
> > FROM causing an ERROR on duplicate input?
> Yes. This change will not be acceptable unless it's made an optional
> (and not default, IMHO, though perhaps that's negotiable) feature of
I see where you're coming from, but seriously what's the use/point of
COPY aborting and doing a rollback if one duplicate key is found? I
think it's quite reasonable to presume the input to COPY has had as
little processing done on it as possible. I could loop through the
input file before sending it to COPY but that's just wasting cycles
and effort - Postgres has btree lookup built in, I don't want to roll
my own before giving Postgres my input file!
> The implementation might be rather messy too. I don't much care
> for the notion of a routine as low-level as bt_check_unique knowing
> that the context is or is not COPY. We might have to do some
Well in reality it wouldn't be "you're getting run from copy" but
rather "notice on duplicate, rather than error & exit". There is a
telling comment in nbtinsert.c just before _bt_check_unique() is
* If we're not allowing duplicates, make sure the key isn't already
* in the index. XXX this belongs somewhere else, likely
So perhaps dupes should be searched for before _bt_doinsert is called,
or somewhere more appropriate?
> > Would:
> > WITH ON_DUPLICATE = CONTINUE|TERMINATE (or similar)
> > need to be added to the COPY command (I hope not)?
> It occurs to me that skip-the-insert might be a useful option for
> INSERTs that detect a unique-key conflict, not only for COPY. (Cf.
> the regular discussions we see on whether to do INSERT first or
> UPDATE first when the key might already exist.) Maybe a SET variable
> that applies to all forms of insertion would be appropriate.
That makes quite a bit of sense.
Lee Kindness, Senior Software Engineer
Concept Systems Limited.
In response to
pgsql-hackers by date
|Next:||From: Tom Lane||Date: 2001-10-01 14:02:54|
|Subject: Re: Bulkloading using COPY - ignore duplicates? |
|Previous:||From: Justin Clift||Date: 2001-10-01 13:48:19|
|Subject: Re: Spinlock performance improvement proposal|