Rapidly decaying performance repopulating a large table

From: "David Wilson" <david(dot)t(dot)wilson(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Rapidly decaying performance repopulating a large table
Date: 2008-04-22 20:31:42
Message-ID: e7f9235d0804221331x2b127d2td2e25b09db7cb641@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have a fairly simple table (a dozen real/integer columns, a few
indexes, one foreign key reference) with ~120m rows. Periodically the
table is truncated or dropped and recreated and the data is
regenerated (slightly different data, of course, or the exercise would
be rather pointless). The regeneration occurs in batches of ~4000 data
points at a time, which are inserted into the table via COPY, and are
coming from several simultaneous processes.

The first several million data points are quite quick (the copy
executes in well under a quarter second). By the time the table
reaches 10-15m rows, however, each individual COPY is taking >20
seconds to execute. Is there anything I can do to improve this
performance? I can't drop/recreate the indices because some of the
data points rely on points generated already in the run, and dropping
the indices would make the required joins ridiculously expensive once
the table starts growing. The foreign key reference *is* droppable for
this regeneration, but I wouldn't expect it to be a performance
problem.

The autovacuum daemon is running in the background, with these
settings: (All autovacuum-specific settings are still at defaults)
vacuum_cost_delay = 50 # 0-1000 milliseconds
vacuum_cost_page_hit = 1 # 0-10000 credits
vacuum_cost_page_miss = 10 # 0-10000 credits
vacuum_cost_page_dirty = 20 # 0-10000 credits
vacuum_cost_limit = 200 # 1-10000 credits

My gut feeling is that better autovacuum settings would help, but I'm
not really sure what to modify to get the improvement I'm looking for.

--
- David T. Wilson
david(dot)t(dot)wilson(at)gmail(dot)com

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Scott Marlowe 2008-04-22 20:38:29 Re: Rapidly decaying performance repopulating a large table
Previous Message John DeSoi 2008-04-22 20:09:21 Re: Schema migration tools?