Re: Compression and on-disk sorting

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, "Bort, Paul" <pbort(at)tmwsystems(dot)com>, pgsql-hackers(at)postgresql(dot)org, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: Compression and on-disk sorting
Date: 2006-05-18 16:18:51
Message-ID: 20060518161850.GI64371@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On Thu, May 18, 2006 at 11:34:51AM +0100, Simon Riggs wrote:
> On Tue, 2006-05-16 at 15:42 -0500, Jim C. Nasby wrote:
> > On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote:
> > > In any case, my curiousity is aroused, so I'm currently benchmarking
> > > pgbench on both a compressed and uncompressed $PGDATA/base. I'll also do
> > > some benchmarks with pg_tmp compressed.
> >
> > Results: http://jim.nasby.net/bench.log
> >
> > As expected, compressing $PGDATA/base was a loss. But compressing
> > pgsql_tmp and then doing some disk-based sorts did show an improvement,
> > from 366.1 seconds to 317.3 seconds, an improvement of 13.3%. This is on
> > a Windows XP laptop (Dell Latitude D600) with 512MB, so it's somewhat of
> > a worst-case scenario. On the other hand, XP's compression algorithm
> > appears to be pretty aggressive, as it cut the size of the on-disk sort
> > file from almost 700MB to 82MB. There's probably gains to be had from a
> > different compression algorithm.
>
> Can you re-run these tests using "SELECT aid FROM accounts ..."
> "SELECT 1 ... " is of course highly compressible.
>
> I also note that the compressed file fits within memory and may not even
> have been written out at all. That's good, but this sounds like the very
> best case of what we can hope to achieve. We need to test a whole range
> of cases to see if it is generally applicable, or only in certain cases
> - and if so which ones.
>
> Would you be able to write up some extensive testing of Martijn's patch?
> He's followed through on your idea and written it (on -patches now...)

Yes, I'm working on that. I'd rather test his stuff than XP's
compression anyway.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim C. Nasby 2006-05-18 16:22:46 Re: Compression and on-disk sorting
Previous Message Bruce Momjian 2006-05-18 16:08:09 Re: pg_dump and backslash escapes

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2006-05-18 16:25:45 Re: [HACKERS] buildfarm failures
Previous Message Bruce Momjian 2006-05-18 16:07:05 Re: Compression and on-disk sorting