Re: Table and Index compression

From: Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>
To: "Sam Mason" <sam(at)samason(dot)me(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Table and Index compression
Date: 2009-08-11 10:05:39
Message-ID: op.uyhszpgkcke6l8@soyouz
Lists: pgsql-hackers


Well, here is the patch. I've included a README, which I paste below.
If anyone wants to play with it (after the CommitFest...), feel free to
do so.
While it was an interesting thing to try, I don't think it has enough
potential to justify more effort...

* How to test

- apply the patch
- copy minilzo.c and minilzo.h to
src/backend/storage/smgr

- configure & make
- enjoy

* How it works

- pg block size set to 32K
- an extra field is added to the page header giving the compressed length

THIS IS BAD: this information should be stored in a separate fork of the
relation, because
- the format would then be backwards compatible
- the number of bytes to read from a compressed page would be known in
advance

- the table file is sparse
- the page header is not compressed
- pages are written at their normal positions, but only the compressed
bytes are written
- if compression gains nothing, the page is stored uncompressed
- the filesystem doesn't store the un-written blocks

* Benefits

- Sparse file holes are not cached, so OS disk cache efficiency is at
least doubled
- Random access is faster, since pages have a better chance of being in
cache (sometimes a bit faster, sometimes it's spectacular)
- Yes, it does save space (> 50%)

* Problems

- Biggest problem: any write of data that compresses worse than whatever
was stored there before can fail with a disk-full error.

- ext3 sparse file handling isn't as fast as I'd like: on seq scans, even
though it reads half as much data and decompresses very fast, it's still
slower...

- many seq scans (especially with aggregates) are CPU bound anyway

- therefore, some kind of background-reader-decompressor would be needed

- pre-allocation has to be done to avoid extreme fragmentation of the
file, which kind of defeats the purpose

- it still causes fragmentation

* Conclusion (for now)

It was a nice thing to try, but I believe this would be better implemented
directly in the filesystem, on the condition that it be implemented well
(i.e. not like NTFS compression).

Attachment Content-Type Size
pg_8.4.0_compression_patch_v001.tar.gz application/x-gzip 26.9 KB
