Re: Good way to insert/update when you're not sure of duplicates?

From: Richard Huxton <dev(at)archonet(dot)com>
To: Curtis Stanford <curtis(at)stanfordcomputing(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Good way to insert/update when you're not sure of duplicates?
Date: 2003-09-25 17:27:18
Message-ID: 200309251827.18882.dev@archonet.com
Lists: pgsql-general

On Thursday 25 September 2003 02:23, Curtis Stanford wrote:
> I'm in a situation where I need to load an ASCII file into a database. No
> sweat so far right? The records are indexed by date and I have a unique
> index on date. The ASCII can overlap, meaning it can contain duplicate
> dates that have been loaded before.
>
> I started out with this ingenious idea. Just try to insert the record. If I
> get an error about duplicate keys, do an update and switch to update mode.
> If I'm updating and get zero rows updated, switch back to insert. Works
> fine, except this whole thing is in a transaction and any errors abort the
> transaction and I can't commit it. So, I decided I'd have to try to update
> every single record and, if zero rows were updated, do the insert. This
> takes a looooong time. Many hours for just 86000 records or so.
>
> Is there a standard way to do this? I can't imagine I'm the only guy that
> needs to do this kind of thing.

Try inserting a batch of 1024 rows, with each batch in its own transaction so
that a failure only loses that batch. If you get an error, drop down to 512
and try that. Repeat until the batch works or you've reached a size of 1.
If the batch worked, move on to the next set of records and repeat. If you
reached a batch size of 1 and it still failed, switch to updating and repeat
the cycle, increasing your batch size as you go.

You might find it quickest to halve the batch size while you're hitting errors
and double it again while batches succeed. The right balance is going to depend
on how many insert vs update rows you have.
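
A rough sketch of the halve/double loop described above, in Python. To keep it
self-contained it runs against the standard-library sqlite3 module rather than
PostgreSQL, so the exception class and the transaction handling would need
swapping for whatever your PostgreSQL driver provides. The table "readings",
its columns "day" and "value", and the function names are all made up for
illustration. Treat it as one possible reading of the idea, not a tuned loader.

```python
import sqlite3

# Hypothetical schema: readings(day PRIMARY KEY, value)
INSERT_SQL = "INSERT INTO readings (day, value) VALUES (?, ?)"
UPDATE_SQL = "UPDATE readings SET value = ? WHERE day = ?"


class BatchFailed(Exception):
    """Raised inside a transaction to force a rollback of the whole batch."""


def try_batch(conn, chunk, mode):
    """Run one batch in its own transaction; True if every row was applied."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            if mode == "insert":
                conn.executemany(INSERT_SQL, chunk)
            else:
                cur = conn.executemany(UPDATE_SQL, [(v, d) for d, v in chunk])
                if cur.rowcount != len(chunk):
                    raise BatchFailed  # some rows in the batch do not exist yet
    except (sqlite3.IntegrityError, BatchFailed):
        return False
    return True


def load_rows(conn, rows, max_batch=1024):
    """Load (day, value) rows, halving the batch on failure, doubling on success."""
    mode, batch, pos, stuck = "insert", max_batch, 0, False
    counts = {"insert": 0, "update": 0}
    while pos < len(rows):
        chunk = rows[pos:pos + batch]
        if try_batch(conn, chunk, mode):
            counts[mode] += len(chunk)
            pos += len(chunk)
            batch = min(batch * 2, max_batch)
            stuck = False
        elif batch > 1:
            batch //= 2
        elif stuck:
            # Neither insert nor update works for this row: give up on it.
            raise RuntimeError(f"cannot insert or update row {chunk[0]!r}")
        else:
            # A single row failed: flip between insert mode and update mode.
            mode = "update" if mode == "insert" else "insert"
            stuck = True
    return counts
```

Calling load_rows(conn, rows) returns a count of rows applied in each mode.
The update branch treats a batch as failed when fewer rows were touched than
were sent, which is how it notices that it has run past the overlap and should
go back to inserting.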

--
Richard Huxton
Archonet Ltd
