Re: PostgreSQL 8.4 performance tuning questions

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Matthew Wakeling <matthew(at)flymine(dot)org>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PostgreSQL 8.4 performance tuning questions
Date: 2009-07-30 21:20:05
Message-ID: C6975C95.DD9D%scott@richrelevance.com
Lists: pgsql-performance


On 7/30/09 1:58 PM, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> Scott Carey <scott(at)richrelevance(dot)com> wrote:
>
>> Now, what needs to be known with the pg_dump is not just how fast
>> compression can go (assuming it's gzip) but also what the duty cycle
>> time of the compression is. If it is single threaded, there is all
>> the network and disk time to cut out of this, as well as all the CPU
>> time that pg_dump does without compression.
>
> Well, I established a couple messages back on this thread that pg_dump
> piped to psql to a database on the same machine writes the 70GB
> database to disk in two hours, while pg_dump to a custom format file
> at default compression on the same machine writes the 50GB file in six
> hours. No network involved, less disk space written. I'll try it
> tonight at -Z0.

So, I'm not sure what overhead the pg_dump custom format carries once you
subtract the compression -- the format itself probably adds some cost
beyond the compression.

-Z1 might be interesting too, but obviously testing it takes some time.
Interesting that your uncompressed case is only 40% larger. For me, the
compressed dump is around 20% of the size of the uncompressed one.
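
Since pg_dump's -Z levels map straight onto zlib compression levels, you
can get a feel for the size/time trade-off outside the database. A rough
sketch in Python (the sample data is synthetic, so the ratios are only
illustrative; real numbers depend entirely on your tables):

import time
import zlib

# Repetitive text as a crude stand-in for COPY data.
sample = b"42\tsome text value\t2009-07-30 21:20:05\n" * 500000

for level in (0, 1, 6):  # -Z0, -Z1, and the default (zlib level 6)
    start = time.time()
    compressed = zlib.compress(sample, level)
    elapsed = time.time() - start
    print("level %d: %5.1f%% of original, %.2fs"
          % (level, 100.0 * len(compressed) / len(sample), elapsed))

Level 0 stores the data with only framing overhead, which is why it can
come out slightly larger than the input while costing almost no CPU.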

>
> One thing I've been wondering about is what, exactly, is compressed in
> custom format. Is it like a .tar.gz file, where the compression is a
> layer over the top, or are individual entries compressed?

It is instructive to open a compressed custom format file in 'less' or
another text viewer.

Basically, it is the same as the uncompressed dump with all the DDL left
uncompressed but the binary chunks compressed. It would seem (an educated
guess from looking at the raw file, not the code) that the table data is
compressed and the DDL points to an offset in the file where the
compressed blob for the COPY data lives.
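
If you want to poke at it programmatically rather than in 'less', here is
a quick heuristic sketch in Python (the file path is hypothetical, and
scanning for zlib header bytes is a guess-check, not a parser):

DUMP = "mydb.dump"  # hypothetical path to a pg_dump -Fc file

with open(DUMP, "rb") as f:
    data = f.read()

# Custom-format dumps start with the "PGDMP" magic; the TOC and DDL
# are stored as readable text, which is why 'less' shows them.
print("magic:", data[:5])  # expect b'PGDMP'

# The table data appears to live in separate zlib (deflate) streams;
# count the common zlib header byte pairs as a rough indicator.
markers = sum(data.count(bytes([0x78, b])) for b in (0x01, 0x9c, 0xda))
print("possible zlib stream starts:", markers)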

> If the
> latter, what's the overhead on setting up each compression stream? Is
> there some minimum size before that kicks in? (I know, I should go
> check the code myself. Maybe in a bit. Of course, if someone already
> knows, it would be quicker....)

Gzip does have some quirky performance behavior depending on the chunk size
of data you stream into it.
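
To your question about per-stream setup cost: you can see both effects --
per-call overhead with tiny chunks, and per-stream overhead when a fresh
stream is created for each chunk -- with a quick zlib sketch (Python
again, synthetic data, so treat the numbers as directional only):

import time
import zlib

rows = [b"id\tname\tcreated_at\tpayload...\n" * 64] * 20000

# One long-lived stream fed chunk by chunk (closest, I'd guess, to
# what pg_dump does within a single table's data entry):
comp = zlib.compressobj(6)
start = time.time()
total = sum(len(comp.compress(r)) for r in rows) + len(comp.flush())
print("one stream : %.2fs, %d bytes" % (time.time() - start, total))

# A fresh stream per chunk: each stream pays its own setup and header,
# and small inputs compress worse because the dictionary restarts.
start = time.time()
total = sum(len(zlib.compress(r, 6)) for r in rows)
print("per chunk  : %.2fs, %d bytes" % (time.time() - start, total))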

>
> -Kevin
>
