From: | Greg Stark <gsstark(at)mit(dot)edu> |
---|---|
To: | Andrew Piskorski <atp(at)piskorski(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Compression and on-disk sorting |
Date: | 2006-05-17 03:48:21 |
Message-ID: | 871wuts456.fsf@stark.xeocode.com |
Lists: | pgsql-hackers pgsql-patches |
Andrew Piskorski <atp(at)piskorski(dot)com> writes:
> The main tricks seem to be: One, EXTREMELY lightweight compression
> schemes - basically table lookups designed to be as cpu friendly as
> possible. Two, keep the data compressed in RAM as well so that you can
> also cache more of the data, and indeed keep it compressed until
> as late in the CPU processing pipeline as possible.
>
> A corollary of that is to forget compression schemes like gzip - it
> reduces data size nicely but is far too slow on the cpu to be
> particularly useful in improving overall throughput rates.

There are some very fast decompression algorithms:

http://www.oberhumer.com/opensource/lzo/

I think most of the mileage from "lookup tables" would be better implemented
at a higher level by giving tools to data modellers that let them achieve
denser data representations. Things like convenient enum data types, 1-bit
boolean data types, short integer data types, etc.

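
To make the "denser data representations" point concrete, here is a rough sketch of what an enum code, a 1-bit boolean and a short integer buy you over a naive layout. Everything in it is an assumption made for illustration: the Status enum, the sample rows, the pack_row/unpack_row helpers and the 3-byte layout are hypothetical and say nothing about how PostgreSQL stores tuples.

```python
import struct
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical enum column: three labels fit in 2 bits."""
    PENDING = 0
    SHIPPED = 1
    RETURNED = 2


# Hypothetical rows: (status label, is_gift flag, quantity)
rows = [
    ("SHIPPED", True, 3),
    ("PENDING", False, 12),
    ("RETURNED", False, 1),
    ("SHIPPED", True, 7),
]


def pack_row(status: str, is_gift: bool, quantity: int) -> bytes:
    """Pack a row into 3 bytes: one byte holding the 2-bit enum code and
    the 1-bit boolean, followed by an unsigned 16-bit quantity."""
    flags = Status[status].value | (int(is_gift) << 2)
    # "<" selects little-endian with no alignment padding.
    return struct.pack("<BH", flags, quantity)


def unpack_row(buf: bytes) -> tuple:
    """Inverse of pack_row: recover the label, the flag and the quantity."""
    flags, quantity = struct.unpack("<BH", buf)
    return Status(flags & 0b11).name, bool((flags >> 2) & 1), quantity


packed = [pack_row(*row) for row in rows]

# Naive layout for comparison: label stored as text, 1-byte bool, 4-byte int.
naive_bytes = sum(len(label.encode()) + 1 + 4 for label, _, _ in rows)
dense_bytes = sum(len(p) for p in packed)

print(f"naive: {naive_bytes} bytes, dense: {dense_bytes} bytes")
```

The decoding side is a mask and a shift, which is the sort of CPU-cheap "table lookup" work described in the quoted text, and it happens only when a value is actually needed.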
--
greg