Re: Compression and on-disk sorting

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Zeugswetter Andreas DCP SD <ZeugswetterA(at)spardat(dot)at>, Greg Stark <gsstark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, "Bort, Paul" <pbort(at)tmwsystems(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Compression and on-disk sorting
Date: 2006-05-18 20:17:46
Message-ID: 20060518201746.GQ64371@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 18, 2006 at 08:32:10PM +0200, Martijn van Oosterhout wrote:
> On Thu, May 18, 2006 at 11:22:46AM -0500, Jim C. Nasby wrote:
> > AFAIK logtape currently reads in much less than 256k blocks. Of course
> > if you get lucky you'll read from one tape for some time before
> > switching to another, which should have a sort-of similar effect if the
> > drives aren't very busy with other things.
>
> Logtape works with BLCKSZ blocks. Whether it's advisable or not, I
> don't know. One thing I'm noticing with this compression-in-logtape is
> that the memory cost per tape increases considerably. Currently we rely
> heavily on the OS to cache for us.
>
> Lets say we buffer 256KB per tape, and we assume a large sort with many
> tapes, you're going to blow all your work_mem on buffers. If using
> compression uses more memory so that you can't have as many tapes and
> thus need multiple passes, well, we need to test if this is still a
> win.

So you're sticking with 8K blocks on disk, after compression? And then
decompressing blocks as they come in?

Actually, I guess the amount of memory used for zlib's lookback buffer
(or whatever they call it) could be pretty substantial, and I'm not sure
if there would be a way to combine that across all tapes.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2006-05-18 20:18:17 Re: [OT] MySQL is bad, but THIS bad?
Previous Message Philippe Schmid 2006-05-18 20:17:32 Re: [OT] MySQL is bad, but THIS bad?