Re: pluggable compression support

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: pluggable compression support
Date: 2013-06-16 01:50:31
Message-ID: CA+Tgmob6_NNFp_PcCekwhVraLw_zpNLTFgMTFe4PB-K17ai-gQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 15, 2013 at 8:11 AM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
> Claiming that the algorithm will be one of only two (current and
> "whatever algorithm we come up with") suggests that it is
> only one bit, which is undoubtedly too little for having a "pluggable"
> compression API :)

See http://www.postgresql.org/message-id/20130607143053.GJ29964@alap2.anarazel.de

>> But those identifiers should be *small* (since they are added to all
>> Datums)
> if there will be any alignment at all between the datums, then
> one byte will be lost in the noise ("remember: nobody will need
> more than 256 compression algorithms")
> OTOH, if you plan to put these format markers in the compressed
> stream and change the compression algorithm while reading it, I am lost.

The above-linked email addresses this point as well: there are bits
available in the toast pointer. But there aren't MANY bits without
increasing the storage footprint, so trying to do something that's
more general than we really need is going to cost us in terms of
on-disk footprint. Is that really worth it? And if so, why? I don't
find the idea of a trade-off between compression/decompression speed
and compression ratio to be very exciting. As Andres says, bzip2 is
impractically slow for ... almost everything. If there's a good
BSD-licensed algorithm available, let's just use it and be done. Our
current algorithm has lasted us a very long time; I see no reason to
think we'll want to change this again for another 10 years, and by
that time, we may have redesigned the storage format altogether,
making the limited extensibility of our current TOAST pointer format
moot.
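
To make the space constraint concrete, here is a minimal sketch, assuming a 32-bit size word whose values never need more than 30 bits, so the two high bits are free to carry a compression-method identifier. The layout and every name below are hypothetical illustrations, not the actual TOAST pointer format; the point is only that two spare bits give four identifiers, and anything more general costs extra bytes per datum.

```python
# Hypothetical layout (an assumption for illustration, not the real
# TOAST pointer): one 32-bit word holding a 30-bit raw size in the low
# bits and a 2-bit compression-method identifier in the two high bits.
SIZE_BITS = 30
SIZE_MASK = (1 << SIZE_BITS) - 1      # largest size that fits in 30 bits
METHOD_BITS = 2
METHOD_MAX = (1 << METHOD_BITS) - 1   # only identifiers 0..3 are possible


def pack(raw_size, method):
    """Combine a size and a method identifier into one 32-bit word."""
    if not 0 <= raw_size <= SIZE_MASK:
        raise ValueError("raw_size does not fit in 30 bits")
    if not 0 <= method <= METHOD_MAX:
        raise ValueError("method does not fit in 2 bits")
    return (method << SIZE_BITS) | raw_size


def unpack(word):
    """Split a packed 32-bit word back into (raw_size, method)."""
    return word & SIZE_MASK, word >> SIZE_BITS
```

With this layout, unpack(pack(size, method)) returns the original pair for any valid input, and a fifth algorithm simply has nowhere to go without widening the word.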

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
