Re: Compression of full-page-writes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>, Andres Freund <andres(at)2ndquadrant(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Compression of full-page-writes
Date: 2013-10-24 16:22:59
Message-ID: CA+TgmoZWTg7LY7B34SMMqNszR69nQCy3_uktyh2_tnwf7FmG-g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 24, 2013 at 11:40 AM, ktm(at)rice(dot)edu <ktm(at)rice(dot)edu> wrote:
> On Thu, Oct 24, 2013 at 11:07:38AM -0400, Robert Haas wrote:
>> On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> > So, our consensus is to introduce the hooks for FPW compression so that
>> > users can freely select their own best compression algorithm?
>> > Also, probably we need to implement at least one compression contrib module
>> > using that hook, maybe it's based on pglz or snappy.
>>
>> I don't favor making this pluggable. I think we should pick snappy or
>> lz4 (or something else), put it in the tree, and use it.
>>
> Hi,
>
> My vote would be for lz4, since it has faster single-thread compression
> and decompression speeds, with its decompression speed being almost 2X
> snappy's. Both are BSD licensed, so that is not an issue. The base code
> for lz4 is C, and it is C++ for snappy. There is also an HC
> (high-compression) variant of lz4 that pushes its compression ratio to
> about the same as zlib (-1). It uses the same decompressor, which can
> deliver data even faster due to the better compression. Some more
> real-world tests would be useful, which is really where being pluggable
> would help.

Well, it's probably a good idea for us to test, during the development
cycle, which algorithm works better for WAL compression, and then use
that one. Once we make that decision, I don't see that there are many
circumstances in which a user would care to override it. Now if we
find that there ARE reasons for users to prefer different algorithms
in different situations, that would be a good reason to make it
configurable (or even pluggable). But if we find that no such reasons
exist, then we're better off avoiding burdening users with the need to
configure a setting that has only one sensible value.

It seems fairly clear from previous discussions on this mailing list
that snappy and lz4 are the top contenders for the position of
"compression algorithm favored by PostgreSQL". I am wondering,
though, whether it wouldn't be better to add support for both - say we
added both to libpgcommon, and perhaps we could consider moving pglz
there as well. That would allow easy access to all of those
algorithms from both front-end and backend code. If we can make the
APIs parallel, it should be very simple to modify any code we add now to
use a different algorithm than the one initially chosen if in the
future we add algorithms to or remove algorithms from the list, or if
one algorithm is shown to outperform another in some particular
context. I think we'll do well to isolate the question of adding
support for these algorithms from the current patch or any other
particular patch that may be on the table, and FWIW, I think having
two leading contenders and adding support for both may have a variety
of advantages over crowning a single victor.
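
To make the "parallel APIs" idea concrete, here is a minimal sketch in C of
what a shared dispatch layer could look like. Every name in it (DemoCompressOps,
demo_compress, and so on) is hypothetical and not an existing PostgreSQL or
libpgcommon interface, and the two methods are stand-ins (a plain copy and a
toy run-length encoder) rather than lz4, snappy, or pglz. The only point is
that all methods share one function signature, so a caller selects an
algorithm by changing a single enum value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical method identifiers; real code might list pglz, lz4, snappy. */
typedef enum
{
    DEMO_COMPRESS_NONE = 0,
    DEMO_COMPRESS_RLE,
    DEMO_COMPRESS_COUNT
} DemoCompressMethod;

/* Every method exposes the same pair of entry points.  Each returns the
 * number of bytes written to dst, or -1 on bad input or a too-small dst. */
typedef struct
{
    const char *name;
    long (*compress) (const uint8_t *src, size_t srclen,
                      uint8_t *dst, size_t dstcap);
    long (*decompress) (const uint8_t *src, size_t srclen,
                        uint8_t *dst, size_t dstcap);
} DemoCompressOps;

/* "none": a plain copy, used in both directions. */
static long
copy_bytes(const uint8_t *src, size_t srclen, uint8_t *dst, size_t dstcap)
{
    if (srclen > dstcap)
        return -1;
    memcpy(dst, src, srclen);
    return (long) srclen;
}

/* Toy run-length encoder: emits (run length 1..255, byte value) pairs. */
static long
rle_compress(const uint8_t *src, size_t srclen, uint8_t *dst, size_t dstcap)
{
    size_t in = 0, out = 0;

    while (in < srclen)
    {
        uint8_t value = src[in];
        size_t  run = 1;

        while (in + run < srclen && src[in + run] == value && run < 255)
            run++;
        if (out + 2 > dstcap)
            return -1;
        dst[out++] = (uint8_t) run;
        dst[out++] = value;
        in += run;
    }
    return (long) out;
}

static long
rle_decompress(const uint8_t *src, size_t srclen, uint8_t *dst, size_t dstcap)
{
    size_t in, out = 0;

    if (srclen % 2 != 0)
        return -1;              /* input must be whole (length, value) pairs */
    for (in = 0; in < srclen; in += 2)
    {
        size_t run = src[in];

        if (run == 0 || out + run > dstcap)
            return -1;
        memset(dst + out, src[in + 1], run);
        out += run;
    }
    return (long) out;
}

/* One table entry per method; adding an algorithm means adding a row. */
static const DemoCompressOps demo_compress_ops[DEMO_COMPRESS_COUNT] = {
    [DEMO_COMPRESS_NONE] = {"none", copy_bytes, copy_bytes},
    [DEMO_COMPRESS_RLE] = {"rle", rle_compress, rle_decompress},
};

long
demo_compress(DemoCompressMethod method, const uint8_t *src, size_t srclen,
              uint8_t *dst, size_t dstcap)
{
    assert((unsigned) method < DEMO_COMPRESS_COUNT);
    return demo_compress_ops[method].compress(src, srclen, dst, dstcap);
}

long
demo_decompress(DemoCompressMethod method, const uint8_t *src, size_t srclen,
                uint8_t *dst, size_t dstcap)
{
    assert((unsigned) method < DEMO_COMPRESS_COUNT);
    return demo_compress_ops[method].decompress(src, srclen, dst, dstcap);
}
```

With a layout like this, code that compresses a page image would call
demo_compress() with whichever method was chosen, and switching algorithms
later would not touch the call sites beyond that one argument.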

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
