pg_dump performance

From: Jared Mauch <jared(at)puck(dot)nether(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: pg_dump performance
Date: 2007-12-26 20:22:30
Message-ID: 20071226202230.GA76524@puck.nether.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


I've been looking at the performance of pg_dump in
the past week off and on trying to see if I can get it to
work a bit faster and was looking for tips on this.

doing a pg_dump on my 34311239 row table (1h of data btw)
results in a wallclock time of 187.9 seconds or ~182k rows/sec.

I've got my insert (COPY) performance around 100k/sec and
was hoping to get the reads to be much faster. The analysis I'm
doing is much faster doing a pg_dump than utilizing a few
queries for numerous reasons. (If you care, I can enumerate them
to you privately but the result is pg_dump is the best way to handle
the multiple bits of analysis that are needed, please trust me).

What i'm seeing:

pg_dump is utilizing about 13% of the cpu and the
corresponding postgres backend is at 100% cpu time.
(multi-core, multi-cpu, lotsa ram, super-fast disk).

I'm not seeing myself being I/O bound so was interested
if there was a way I could tweak the backend performance or
offload some of the activity to another process.

pg8.3(beta) with the following variances from default

checkpoint_segments = 300 # in logfile segments, min 1, 16MB each
effective_cache_size = 512MB # typically 8KB each
wal_buffers = 128MB # min 4, 8KB each
shared_buffers = 128MB # min 16, at least max_connections*2, 8KB each
work_mem = 512MB # min 64, size in KB

unrelated but associated data, the table has one index on it.
not relevant for pg_dump but i'm interested in getting better concurent index
creation (utilize those cpus better but not slow down my row/sec perf)
but that's another topic entirely..

Any tips on getting pg_dump (actually the backend) to perform
much closer to 500k/sec or more? This would also aide me when I upgrade
pg versions and need to dump/restore with minimal downtime (as the data
never stops coming.. whee).

Thanks!

- Jared

--
Jared Mauch | pgp key available via finger from jared(at)puck(dot)nether(dot)net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Heikki Linnakangas 2007-12-26 20:52:08 Re: pg_dump performance
Previous Message Fernando Hevia 2007-12-26 19:32:21 Re: With 4 disks should I go for RAID 5 or RAID 10