Re: pg_dump far too slow

From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: David Newall <postgresql(at)davidnewall(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: pg_dump far too slow
Date: 2010-03-16 03:04:45
Message-ID: ca24673e1003152004r471e4be5t37ad32511b097c09@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

As a fellow PG newbie, some thoughts / ideas ....

1. What is the prupose of the dump (backup, migration, ETL, etc.)? Why
plain? Unless you have a need to load this into a different brand of
database at short notice, I'd use native format.

2. If you goal is indeed to get the data into another DB, use an app which
can do a binary-to-binary transfer, e.g. a little homegrown tool in Java
that connects to both with JDBC, or a data migration ETL tool.

3. If pg_dump is still CPU bound, then don't get pg_dump to compress the
archive, instead do *pg_dump -F c -Z 0 ... | gzip >foo.dmp.gz* ... this
way the compression runs on a different core from the formatting

4. Don't use *-Z9*, the return on investment isn't worth it (esp. if you are
CPU bound), use the default GZIP compression instead, or if you need to
minimize storage, experiment with higher levels until the CPU running GZIP
is close to, but not totally, maxed out.

5. I see no other drives mentioned ... is your dump being written to a
partition on the same RAID-1 pair that PG is running on? Spring for another
drive to avoid the seek contention ... even if you were to stream the dump
to a temporary filesystem on a single commodity consumer drive ($99 for a
1.5TB SATA-300 spindle) with no RAID, you could then copy it back to the
RAID set after pg_dump completes, and I'd give you good odds it'd be a
quicker end to end process.

Cheers
Dave

On Sun, Mar 14, 2010 at 3:01 AM, David Newall <postgresql(at)davidnewall(dot)com>wrote:

> Evening all,
>
> Maiden post to this list. I've a performance problem for which I'm
> uncharacteristically in need of good advice.
>
> I have a read-mostly database using 51GB on an ext3 filesystem on a server
> running Ubuntu 9.04 and PG 8.3. Forty hours ago I started a plain-format
> dump, compressed with -Z9, and it is still running, having produced 32GB of
> an expected 40 - 45GB of compressed output.
>
<snip>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2010-03-16 03:13:02 Re: Re: [PERFORM] [offtopic] Problems subscribing to Postgres mailing lists
Previous Message Robert Haas 2010-03-16 01:58:57 Re: GiST index performance