Re: Compression and on-disk sorting

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zeugswetter Andreas DCP SD <ZeugswetterA(at)spardat(dot)at>, Greg Stark <gsstark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, "Bort, Paul" <pbort(at)tmwsystems(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Compression and on-disk sorting
Date: 2006-05-21 11:44:27
Message-ID: 20060521114427.GD10443@svana.org
Lists: pgsql-hackers

On Fri, May 19, 2006 at 01:39:45PM -0500, Jim C. Nasby wrote:
> > Do you have any stats on CPU usage? Memory usage?
>
> I've only been taking a look at vmstat from time-to-time, and I have yet
> to see the machine get CPU-bound. Haven't really paid much attention to
> memory. Is there anything in particular you're looking for? I can log
> vmstat for the next set of runs (with a scaling factor of 10000). I plan
> on doing those runs tonight...

I've got some more info on zlib's memory usage:

Compression: 5816 bytes + 256KB buffer = approx 261KB
Decompression: 9512 bytes + 32KB buffer = approx 42KB
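
For reference, the buffer sizes above match the memory formulas documented in zlib's zconf.h for the default settings (windowBits=15, memLevel=8). The sketch below just redoes that arithmetic. It assumes the defaults are what was in use here, and the function and constant names are made up for illustration, not taken from zlib or logtape.

```python
# Arithmetic sketch only: reproduces the totals quoted above from the
# formulas documented in zlib's zconf.h. Assumes default zlib parameters.
WINDOW_BITS = 15  # zlib default (MAX_WBITS)
MEM_LEVEL = 8     # zlib default (DEF_MEM_LEVEL)

# State structure sizes as measured in this message (bytes).
DEFLATE_STATE_BYTES = 5816
INFLATE_STATE_BYTES = 9512


def deflate_buffer_bytes(window_bits=WINDOW_BITS, mem_level=MEM_LEVEL):
    """Documented deflate buffers: (1 << (windowBits+2)) + (1 << (memLevel+9))."""
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))


def inflate_buffer_bytes(window_bits=WINDOW_BITS):
    """Documented inflate window: 1 << windowBits."""
    return 1 << window_bits


compress_total = DEFLATE_STATE_BYTES + deflate_buffer_bytes()
decompress_total = INFLATE_STATE_BYTES + inflate_buffer_bytes()

print(f"compression:   {compress_total} bytes ({compress_total / 1024:.1f} KiB)")
print(f"decompression: {decompress_total} bytes ({decompress_total / 1024:.1f} KiB)")
```

With one such structure held per tape, the compression side is the one that adds up quickly, which is why freeing it early matters.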

As Tom said, you only run one compression at a time but logtape doesn't
know that. It can only free the compression structures on Rewind or
Freeze, neither of which is run until the merge pass. I don't
understand the algorithm enough to know if it's safe to rewind the old
tape in selectnewtape. That would seem to defeat the "freeze if only
one tape" optimisation.

One final thing, with trace_sort=on on my machine I get this with
compression:

LOG: performsort done (except 28-way final merge): CPU 1.48s/7.49u sec elapsed 10.24 sec
LOG: external sort ended, 163 disk blocks used: CPU 1.48s/7.49u sec elapsed 10.30 sec

and without compression:

LOG: performsort done (except 28-way final merge): CPU 2.85s/1.90u sec elapsed 14.76 sec
LOG: external sort ended, 18786 disk blocks used: CPU 2.88s/1.90u sec elapsed 15.70 sec

This indicates an awful lot of I/O waiting: without compression, close
to 70% of the elapsed time (roughly 10 of 15 sec) is not spent on CPU.
The compression has cut the I/O wait from about 10 sec to under 1.5 sec
at the expense of about 5.5 sec of extra user CPU time for compression.
If you had a faster compression algorithm (zlib is not that fast) the
results would be even better...
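
The wait figures can be rechecked from the trace_sort lines themselves. This is a rough sketch, assuming that any elapsed time not accounted for as system or user CPU is I/O wait (it could include other waits). The regex and the io_wait helper are illustrative, not part of PostgreSQL.

```python
import re

# Matches the timing tail of a trace_sort line, e.g.
# "CPU 1.48s/7.49u sec elapsed 10.30 sec"
TIMING_RE = re.compile(
    r"CPU (?P<sys>[\d.]+)s/(?P<user>[\d.]+)u sec elapsed (?P<elapsed>[\d.]+) sec"
)


def io_wait(line):
    """Return (wait_seconds, wait_fraction) for one trace_sort log line.

    Assumption: wait = elapsed - (system CPU + user CPU).
    """
    m = TIMING_RE.search(line)
    if m is None:
        raise ValueError("no trace_sort timing found in line")
    sys_s = float(m["sys"])
    user_s = float(m["user"])
    elapsed = float(m["elapsed"])
    wait = elapsed - (sys_s + user_s)
    return wait, wait / elapsed


with_compression = (
    "LOG: external sort ended, 163 disk blocks used: "
    "CPU 1.48s/7.49u sec elapsed 10.30 sec"
)
without_compression = (
    "LOG: external sort ended, 18786 disk blocks used: "
    "CPU 2.88s/1.90u sec elapsed 15.70 sec"
)

for label, line in [("with", with_compression), ("without", without_compression)]:
    wait, frac = io_wait(line)
    print(f"{label:8s} compression: {wait:.2f} sec waiting ({frac:.0%} of elapsed)")
```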

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
