From: Steven Rosenstein <srosenst(at)us(dot)ibm(dot)com>
To: pgsql-perform <pgsql-performance(at)postgresql(dot)org>
Subject:
Date: 2005-05-03 15:55:15
Message-ID: OF93C70832.2AC188CE-ON85256FF6.0056B587-85256FF6.005774C2@us.ibm.com

In our application we have tables that we regularly load with 5-10 million
records daily. We *were* using INSERT (I know... still kicking ourselves
for *that* design decision), and we are now converting over to COPY. For
the sake of robustness, we plan to break the entire load into chunks of a
couple hundred thousand records each, to constrain the amount of data we'd
have to re-process if one of the COPYs fails.
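For concreteness, here is a rough sketch of the chunked loader we have in
mind (Python with psycopg2; the table name, connection string, and file
layout are placeholders, and the real loader does more bookkeeping):

    import io
    import itertools

    import psycopg2

    CHUNK_SIZE = 200_000  # a couple hundred thousand records per COPY

    def load_in_chunks(path, dsn="dbname=ourdb", table="our_table"):
        """COPY the file at `path` into `table`, one transaction per chunk."""
        conn = psycopg2.connect(dsn)
        failed = []
        with open(path) as f:
            for chunk_no in itertools.count():
                lines = list(itertools.islice(f, CHUNK_SIZE))
                if not lines:
                    break
                buf = io.StringIO("".join(lines))
                cur = conn.cursor()
                try:
                    # default COPY text format: tab-delimited, \N for NULL
                    cur.copy_expert(f"COPY {table} FROM STDIN", buf)
                    conn.commit()  # each chunk commits independently
                except psycopg2.Error:
                    conn.rollback()
                    failed.append(chunk_no)  # only this chunk gets re-processed
                finally:
                    cur.close()
        conn.close()
        return failed

Since each chunk commits on its own, a failure partway through the load
only costs us one chunk's worth of re-processing rather than the whole
5-10 million rows.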

My question is, are there any advantages, drawbacks, or outright
restrictions to using multiple simultaneous COPY commands to load data into
the same table? One issue that comes to mind is the loss of data
sequencing if we have multiple chunks interleaving records in the table at
the same time. But from a purely technical point of view, is there any
reason why the backend would not be happy with two or more COPY commands
trying to insert data into the same table at the same time? Does COPY take
out any locks on a table?
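Concretely, the parallel version would look something like this (again
just a sketch; each worker gets its own connection and COPYs its own
pre-split chunk file, all into the same table; file names are
placeholders):

    from concurrent.futures import ThreadPoolExecutor

    import psycopg2

    def copy_chunk(path, dsn="dbname=ourdb", table="our_table"):
        # one connection (and thus one backend) per worker
        conn = psycopg2.connect(dsn)
        try:
            with conn, conn.cursor() as cur, open(path) as f:
                cur.copy_expert(f"COPY {table} FROM STDIN", f)
        finally:
            conn.close()

    chunk_files = ["chunk_00.tsv", "chunk_01.tsv", "chunk_02.tsv"]
    with ThreadPoolExecutor(max_workers=len(chunk_files)) as pool:
        list(pool.map(copy_chunk, chunk_files))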

Thanks in advance,
--- Steve
