Re: TOAST compression

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Neil Conway <neilc(at)samurai(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TOAST compression
Date: 2006-02-26 20:19:45
Message-ID: 1140985185.3716.44.camel@localhost.localdomain
Lists: pgsql-hackers

On one fine day, Sun, 2006-02-26 at 09:31, Luke Lonergan wrote:
> Jim,
>
> On 2/26/06 8:00 AM, "Jim C. Nasby" <jnasby(at)pervasive(dot)com> wrote:
>
> > Any idea on how decompression time compares to IO bandwidth? In other
> > words, how long does it take to decompress 1MB vs read that 1MB vs read
> > whatever the uncompressed size is?
>
> On DBT-3 data, I've just run some tests meant to simulate the speed
> differences of compression versus native I/O. My thought is that an
> external use of gzip on a binary dump file should be close to the speed of
> LZW on toasted fields,

Your basic assumption is probably wrong :(

Which gzip? The "compression level" setting of gzip has a big effect on both
compression speed and compression ratio. And I suspect that even the
fastest level (gzip -1) compresses slower and better than PostgreSQL's
lzcompress.
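
To illustrate how much the level setting matters, here is a minimal sketch that
times Python's zlib module (DEFLATE, the same algorithm family gzip uses) at
levels 1, 6 and 9. The sample data is synthetic and only loosely shaped like
supplier rows, which is an assumption; it says nothing about PostgreSQL's
lzcompress, which is a different algorithm and is not measured here.

```python
import time
import zlib


def measure(data: bytes, level: int) -> tuple:
    """Return (seconds, compressed/original size ratio) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return elapsed, len(compressed) / len(data)


# Synthetic, repetitive sample rows (an assumption, not the DBT-3 supplier table).
sample = b"".join(
    b"Supplier#%09d|some address text|%d\n" % (i, i % 25) for i in range(50000)
)

# Levels 1 (fastest), 6 (gzip's default) and 9 (best compression).
results = {level: measure(sample, level) for level in (1, 6, 9)}
for level, (secs, ratio) in results.items():
    print(f"level {level}: {secs:.4f}s, ratio {ratio:.3f}")
```

The absolute numbers depend on the machine and the data; the point is only
that a gzip timing means little unless the level is stated.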

> so I just dumped the "supplier" table (see below) of
> size 202MB in data pages to disk, then ran gzip/gunzip on the binary
> file. Second test - an 8k block dd from that same file, meant to simulate a
> seq scan (it's faster by 25% than doing it in PG though):
>
> ==================== gzip/gunzip =====================
> [mppdemo1(at)salerno0]$ ls -l supplier.bin
> -rw-r--r-- 1 mppdemo1 mppdemo1 177494266 Feb 26 09:17 supplier.bin
>
> [mppdemo1(at)salerno0]$ time gzip supplier.bin
>
> real 0m12.979s
> user 0m12.558s
> sys 0m0.400s
> [mppdemo1(at)salerno0]$ time gunzip supplier.bin
>
> real 0m2.286s
> user 0m1.713s
> sys 0m0.573s

These tests are also somewhat bogus. If you want them to be comparable
with the dd run below, you should have used 'time gzip -c
supplier.bin > /dev/null'.
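
The idea of a like-for-like comparison can be sketched as follows: both paths
read the same file in 8k blocks and throw the result away, so the only
difference being timed is the compression work. This is a rough sketch using
Python's zlib rather than the gzip binary; the temporary file and its contents
are hypothetical stand-ins for supplier.bin.

```python
import os
import tempfile
import time
import zlib

BLOCK = 8192  # same block size as the dd run (bs=8k)


def read_only(path):
    """Read the file in 8k blocks and discard the data, like dd of=/dev/null."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(BLOCK):
            total += len(chunk)
    return total


def compress_and_discard(path, level=6):
    """Read in 8k blocks, compress, discard output, like gzip -c > /dev/null."""
    comp = zlib.compressobj(level)
    bytes_in = bytes_out = 0
    with open(path, "rb") as f:
        while chunk := f.read(BLOCK):
            bytes_in += len(chunk)
            bytes_out += len(comp.compress(chunk))
    bytes_out += len(comp.flush())  # count the final buffered output too
    return bytes_in, bytes_out


# Hypothetical stand-in for supplier.bin.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"supplier row data|" * 100000)
    demo_path = tmp.name

t0 = time.perf_counter()
read_total = read_only(demo_path)
t1 = time.perf_counter()
bytes_in, bytes_out = compress_and_discard(demo_path)
t2 = time.perf_counter()
os.remove(demo_path)

print(f"read only:        {t1 - t0:.4f}s for {read_total} bytes")
print(f"compress+discard: {t2 - t1:.4f}s, {bytes_in} -> {bytes_out} bytes")
```

Neither path writes an output file or removes the input, which is what the
plain 'gzip supplier.bin' invocation does in addition to compressing.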

> [mppdemo1(at)salerno0]$ time dd if=supplier.bin of=/dev/null bs=8k
> 21666+1 records in
> 21666+1 records out
>
> real 0m0.138s
> user 0m0.003s
> sys 0m0.135s
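
For reference, the quoted wall-clock times can be turned into throughput
figures. The sketch below uses only the file size and the "real" times from
the transcripts above; expressing the gunzip rate relative to the uncompressed
size is my assumption, since the compressed size is not shown.

```python
SIZE_BYTES = 177494266  # supplier.bin size from the ls -l output

# "real" times in seconds, taken from the quoted transcripts.
timings = {"gzip": 12.979, "gunzip": 2.286, "dd bs=8k": 0.138}


def mb_per_sec(size_bytes, seconds):
    """Throughput in decimal megabytes (10^6 bytes) per second."""
    return size_bytes / seconds / 1e6


rates = {name: mb_per_sec(SIZE_BYTES, secs) for name, secs in timings.items()}
for name, rate in rates.items():
    print(f"{name:>9}: {rate:8.1f} MB/s")

# Cross-check against dd's "21666+1 records": full 8k blocks plus one partial.
full_blocks, remainder = divmod(SIZE_BYTES, 8192)
print(f"{full_blocks}+{1 if remainder else 0} records of 8192 bytes")
```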

----------------
Hannu
