Re: pg_dump far too slow

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: David Newall <postgresql(at)davidnewall(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org, robertmhaas(at)gmail(dot)com, dcrooke(at)gmail(dot)com
Subject: Re: pg_dump far too slow
Date: 2010-03-21 13:56:38
Message-ID: 4BA62596.8050604@postnewspapers.com.au
Lists: pgsql-performance

On 21/03/2010 9:17 PM, David Newall wrote:
> Thanks for all of the suggestions, guys, which gave me some pointers on
> new directions to look, and I learned some interesting things.
>

> Unfortunately one of these processes dropped eventually, and, according
> to top, the only non-idle process running was gzip (100%.) Obviously
> there were postgres and pg_dump processes, too, but they were throttled
> by gzip's rate of output and effectively idle (less than 1% CPU). That
> is also interesting. The final output from gzip was being produced at
> the rate of about 0.5MB/second, which seems almost unbelievably slow.

CPU isn't the only measure of interest here.

If pg_dump and the postgres backend it's using are doing simple work
such as reading linear data from disk, they won't show much CPU activity
even though they might be running full-tilt. They'll be limited by disk
I/O or other non-CPU resources.
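
As a rough illustration of that point (a sketch, not a measurement of pg_dump itself): a process that spends its time waiting accumulates almost no CPU time even though wall-clock time keeps passing. Here time.sleep is only a stand-in for blocking on a disk read, and the function name is made up for the example.

```python
import time


def measure_waiting_task(wait_seconds=0.2):
    """Return (wall_seconds, cpu_seconds) for a task that mostly waits."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()

    # Assumption: sleeping stands in for a process blocked on disk I/O.
    time.sleep(wait_seconds)

    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


wall, cpu = measure_waiting_task()
print(f"wall-clock: {wall:.3f}s  CPU: {cpu:.3f}s")
```

A tool like top reports only the CPU figure, so a process in this state looks idle whether it is stalled or busy waiting on I/O.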

> and wonder if I should read up on gzip to find why it would work so
> slowly on a pure text stream, albeit a representation of PDF which
> intrinsically is fairly compressed.

In fact, PDF typically compresses its content streams with deflate, the same algorithm gzip uses.
Gzip-compressing PDF is almost completely pointless - all you're doing
is compressing some of the document structure, not the actual content
streams. With PDF 1.5 and above using object and xref streams, you might
not even be doing that, instead only compressing the header and trailer
dictionary, which are probably in the order of a few hundred bytes.

Compressing PDF documents is generally a waste of time.
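
You can see the effect without a PDF at hand. The sketch below uses Python's zlib module, which implements the same deflate algorithm as gzip with a different wrapper, and compresses some sample text twice. The sample text, the vocabulary and the function names are invented for the example; the first pass stands in for what a PDF producer has already done to its streams, the second for running gzip over the result.

```python
import random
import zlib


def make_sample_text(n_words=200_000, seed=0):
    """Build compressible but non-trivial text from a small vocabulary."""
    rng = random.Random(seed)
    vocab = [b"select", b"insert", b"update", b"delete",
             b"table", b"index", b"vacuum", b"analyze"]
    return b" ".join(rng.choice(vocab) for _ in range(n_words))


def compress_twice(raw):
    """Return the sizes of raw data, one deflate pass, and two passes."""
    once = zlib.compress(raw, 6)    # like the deflate already inside a PDF
    twice = zlib.compress(once, 6)  # like gzip applied on top of it
    return len(raw), len(once), len(twice)


raw_len, once_len, twice_len = compress_twice(make_sample_text())
print(f"raw: {raw_len} bytes")
print(f"deflated once: {once_len} bytes")
print(f"deflated twice: {twice_len} bytes")
```

The first pass should shrink the data substantially; the second should leave it at about the same size, while still costing CPU time.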

--
Craig Ringer
