Re: Are there plans to add data compression feature to postgresql?

From: Grant Allen <gxallen(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Are there plans to add data compression feature to postgresql?
Date: 2008-10-29 23:53:27
Message-ID: 4908F777.8010001@gmail.com
Lists: pgsql-general
Tom Lane wrote:
> 小波 顾 <guxiaobo1982(at)hotmail(dot)com> writes:
>   
>> [ snip a lot of marketing for SQL Server ]
>>     
>
> I think the part of this you need to pay attention to is
>
>   
>> Of course, nothing is entirely free, and this reduction in space and
>> time come at the expense of using CPU cycles.
>>     
>
> We already have the portions of this behavior that seem to me to be
> likely to be worthwhile (such as NULL elimination and compression of
> large field values).  Shaving a couple bytes from a bigint doesn't
> strike me as interesting.

Think about it on a fact table for a warehouse.  A few bytes per bigint 
multiplied by several billions/trillions of bigints (not an exaggeration 
in a DW) and you're talking some significant storage saving on the main 
storage hog in a DW.  Not to mention the performance _improvements_ you 
can get, even with some CPU overhead for dynamic decompression, if the 
planner/optimiser understands how to work with the compression index/map 
to perform things like range/partition elimination etc.  Admittedly this 
depends heavily on the storage mechanics and optimisation techniques of 
the DB, but there is value to be had there ... IBM is seeing typical 
storage savings in the 40-60% range, mostly based on boring, 
bog-standard int, char and varchar data.
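
To put rough numbers on the "few bytes per bigint" point, here is a 
back-of-the-envelope sketch.  The row count, column count and per-value 
saving are made-up assumptions for illustration, not measurements from 
any real system, and the function names are hypothetical.

```python
BIGINT_BYTES = 8  # on-disk size of an uncompressed bigint


def bytes_saved(rows: int, bigint_columns: int, saved_per_value: int) -> int:
    """Total bytes saved, assuming every value shrinks by the same amount."""
    if not 0 <= saved_per_value <= BIGINT_BYTES:
        raise ValueError("saving per value must be between 0 and 8 bytes")
    return rows * bigint_columns * saved_per_value


def fraction_saved(saved_per_value: int) -> float:
    """Share of the bigint storage that the saving represents."""
    return saved_per_value / BIGINT_BYTES


if __name__ == "__main__":
    # Assumed fact table: 10 billion rows, 5 bigint columns,
    # 3 bytes shaved off each value on average.
    saved = bytes_saved(10_000_000_000, 5, 3)
    print(f"{saved / 10**9:.0f} GB saved "
          f"({fraction_saved(3):.1%} of the bigint storage)")
```

This ignores row headers, alignment padding and any index overhead, so 
treat it as an order-of-magnitude estimate only.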

The IDUG (so DB2 users themselves, not IBM's marketing) ran a 
competition to see what was happening in the real world.  Take a look if 
interested: http://www.idug.org/wps/portal/idug/compressionchallenge

Other big benefits come with XML ... but that is even more dependent on 
the starting point.  Oracle and SQL Server will see big benefits in 
compression with this, because their XML technology is so 
mind-bogglingly broken in the first place.

So there's certainly utility in this kind of feature ... but whether it 
rates above some of the other great stuff in the PostgreSQL pipeline is 
questionable.

Ciao
Fuzzy
:-)

------------------------------------------------
Dazed and confused about technology for 20 years
http://fuzzydata.wordpress.com/

