pg_dump far too slow

From: David Newall <postgresql(at)davidnewall(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: pg_dump far too slow
Date: 2010-03-14 08:01:37
Message-ID: 4B9C97E1.40509@davidnewall.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Evening all,

Maiden post to this list. I've a performance problem for which I'm
uncharacteristically in need of good advice.

I have a read-mostly database using 51GB on an ext3 filesystem on a
server running Ubuntu 9.04 and PG 8.3. Forty hours ago I started a
plain-format dump, compressed with -Z9, and it is still running, having
produced 32GB of an expected 40 - 45GB of compressed output. CPU load
is 100% on the core executing pg_dump, and negligible on all others
cores. The system is read-mostly, and largely idle. The exact
invocation was:

nohup time pg_dump -f database.dmp -Z9 database

I presumed pg_dump was CPU-bound because of gzip compression, but a test
I ran makes that seem unlikely: Copying the database files to a USB hard
drive (cp -r /var/lib/postgresql/8.3/main /mnt) took 25 minutes; and
gzip-compressing the first first 500MB of the dumpfile (dd
if=database.dmp bs=64k count=16000 | time gzip -9 > dd.gz) took one
minute and 15 seconds; to gzip the complete 51GB set of files should
take no more than 90 minutes.

The database is unremarkable except for one table, the biggest, which
contains a bytea column, and which pg_dump has been outputting for at
least 39 hours. That table has 276,292 rows, in which the bytea for
140,695 contains PDFs totalling 32,791MB, and the bytea for the
remaining 135,597 rows contains PostScript totalling 602MB. I think
I've never done a full vacuum; only ever auto-vacuum; however I did copy
the table to new, deleted the old, and renamed, which I expect is
effectively equivalent for it; which is described by the following schema:

Table "database.bigtable"
Column | Type | Modifiers
--------------+-------------------+--------------------
headerid | integer | not null
member | numeric(10,0) | not null
postcode | character varying |
bsp | character varying |
details | bytea | not null
membertypeid | integer | not null default 0
Indexes:
"bigtable_pkey" PRIMARY KEY, btree (headerid, member)
"bigtable_member" btree (member)
Foreign-key constraints:
"bigtable_headerid_fkey" FOREIGN KEY (headerid) REFERENCES header(headerid)

The following describes the application environment:

* PostgreSQL 8.3.8 on i486-pc-linux-gnu, compiled by GCC
gcc-4.3.real (Ubuntu 4.3.3-5ubuntu4) 4.3.3
* pg_dump (PostgreSQL) 8.3.8
* Ubuntu 9.04
* Linux server 2.6.28-13-server #45-Ubuntu SMP Tue Jun 30 20:51:10
UTC 2009 i686 GNU/Linux
* Intel(R) Xeon(R) CPU E5430 @ 2.66GHz (4 core)
* RAM 2GB
* 2 SATA, 7200rpm disks with hardware RAID-1 (IBM ServeRAID)

My question is, what's going on?

Thanks,

David

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message fkater@googlemail.com 2010-03-14 16:31:05 Re: Deleting bytea, autovacuum, and 8.2/8.4 differences
Previous Message Dave Crooke 2010-03-14 03:04:41 Re: Deleting bytea, autovacuum, and 8.2/8.4 differences