Re: Select count(*), the sequel

From: Thomas Kellerer <spam_eater(at)gmx(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Select count(*), the sequel
Date: 2010-10-27 20:54:11
Message-ID: iaa3hl$pkk$1@dough.gmane.org
Lists: pgsql-performance

Kenneth Marshall, 27.10.2010 22:41:
> Different algorithms have been discussed before. A quick search turned
> up:
>
> quicklz - GPL or commercial
> fastlz - MIT works with BSD okay
> zippy - Google - no idea about the licensing
> lzf - BSD-type
> lzo - GPL or commercial
> zlib - current algorithm
>
> Of these, lzf can compress at almost 3.7X the speed of zlib and decompress
> at 1.7X, and fastlz can compress at 3.1X the speed of zlib and decompress
> at 1.9X. The same comparison puts lzo at 3.0X for compression and 1.8X for
> decompression. The block design of lzf/fastlz may be useful to support
> substring access to toasted data, among other ideas that have been floated
> here in the past.
>
> Just keeping the hope alive for faster compression.
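
(To illustrate the substring-access idea above: if a value is compressed in fixed-size, independently compressed blocks, a slice can be served by decompressing only the blocks that cover it. The Python sketch below is purely illustrative; zlib and the 8 kB block size are arbitrary choices for the example, not the actual TOAST code.)

import zlib

CHUNK = 8192  # hypothetical block size, chosen only for the example

def compress_blocks(data: bytes):
    """Compress each fixed-size chunk independently."""
    return [zlib.compress(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

def read_slice(blocks, start: int, length: int) -> bytes:
    """Read a byte range by decompressing only the blocks that cover it."""
    first, last = start // CHUNK, (start + length - 1) // CHUNK
    piece = b"".join(zlib.decompress(b) for b in blocks[first:last + 1])
    offset = start - first * CHUNK
    return piece[offset:offset + length]

blocks = compress_blocks(b"x" * 100_000)
assert read_slice(blocks, 40_000, 10) == b"x" * 10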

What about dictionary-based compression (like DB2 does)?

In a nutshell: it builds a list of "words" per page. For each word, its occurrences within the db block are recorded and the actual word is removed from the page/block itself. This covers all rows on a page and can give a very impressive overall compression.
The compression is applied not only on disk but in memory as well (the page is loaded into memory together with its dictionary).

I believe Oracle 11 does something similar.
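
A rough sketch of the idea in Python (purely illustrative, not DB2's actual on-disk format): each page keeps a small dictionary of the values that occur more than once, and the rows store dictionary indexes instead of the values themselves, on disk and in memory alike.

from collections import Counter

def compress_page(rows):
    """Move repeated cell values into a per-page dictionary; rows keep indexes."""
    counts = Counter(v for row in rows for v in row)
    dictionary = [v for v, n in counts.items() if n > 1]  # only values worth sharing
    index = {v: i for i, v in enumerate(dictionary)}
    packed = [[("D", index[v]) if v in index else ("L", v) for v in row]
              for row in rows]
    return dictionary, packed

def decompress_page(dictionary, packed):
    """Rebuild the original rows from the dictionary and the packed form."""
    return [[dictionary[x] if tag == "D" else x for tag, x in row] for row in packed]

page = [["Germany", "EUR", "open"],
        ["Germany", "EUR", "closed"],
        ["France",  "EUR", "open"]]
dictionary, packed = compress_page(page)
assert decompress_page(dictionary, packed) == page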

Regards
Thomas
