Re: Compression and on-disk sorting

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, "Bort, Paul" <pbort(at)tmwsystems(dot)com>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: Compression and on-disk sorting
Date: 2006-05-18 10:34:51
Message-ID: 1147948491.2646.427.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On Tue, 2006-05-16 at 15:42 -0500, Jim C. Nasby wrote:
> On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote:
> > In any case, my curiousity is aroused, so I'm currently benchmarking
> > pgbench on both a compressed and uncompressed $PGDATA/base. I'll also do
> > some benchmarks with pg_tmp compressed.
>
> Results: http://jim.nasby.net/bench.log
>
> As expected, compressing $PGDATA/base was a loss. But compressing
> pgsql_tmp and then doing some disk-based sorts did show an improvement,
> from 366.1 seconds to 317.3 seconds, an improvement of 13.3%. This is on
> a Windows XP laptop (Dell Latitude D600) with 512MB, so it's somewhat of
> a worst-case scenario. On the other hand, XP's compression algorithm
> appears to be pretty aggressive, as it cut the size of the on-disk sort
> file from almost 700MB to 82MB. There's probably gains to be had from a
> different compression algorithm.

Can you re-run these tests using "SELECT aid FROM accounts ..."
"SELECT 1 ... " is of course highly compressible.

I also note that the compressed file fits within memory and may not even
have been written out at all. That's good, but this sounds like the very
best case of what we can hope to achieve. We need to test a whole range
of cases to see if it is generally applicable, or only in certain cases
- and if so which ones.

Would you be able to write up some extensive testing of Martijn's patch?
He's followed through on your idea and written it (on -patches now...)

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2006-05-18 11:26:16 Re: Google and the Beta Freeze
Previous Message Simon Riggs 2006-05-18 10:34:36 Re: [PATCH] Compression and on-disk sorting

Browse pgsql-patches by date

  From Date Subject
Next Message Martijn van Oosterhout 2006-05-18 11:42:01 Re: [PATCH] Compression and on-disk sorting
Previous Message Simon Riggs 2006-05-18 10:34:36 Re: [PATCH] Compression and on-disk sorting