Re: Improve compression speeds in pg_lzcompress.c

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Takeshi Yamamuro <yamamuro(dot)takeshi(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improve compression speeds in pg_lzcompress.c
Date: 2013-01-07 21:36:25
Message-ID: CAHyXU0znAiQ8Ok1ZjZ3-bufzsjZXBLpFDPDOWiW7bY5gCnVh7g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 7, 2013 at 2:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Mon, Jan 7, 2013 at 10:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Takeshi Yamamuro <yamamuro(dot)takeshi(at)lab(dot)ntt(dot)co(dot)jp> writes:
>>>> The attached is a patch to improve compression speeds with loss of
>>>> compression ratios in backend/utils/adt/pg_lzcompress.c.
>
>>> Why would that be a good tradeoff to make? Larger stored values require
>>> more I/O, which is likely to swamp any CPU savings in the compression
>>> step. Not to mention that a value once written may be read many times,
>>> so the extra I/O cost could be multiplied many times over later on.
>
>> I disagree. pg compression is so awful it's almost never a net win.
>> I turn it off.
>
> One report doesn't make it useless, but even if it is so on your data,
> why would making it even less effective be a win?

That's a fair point. I'm neutral on the OP's proposal -- it's just
moving spots around the dog. If we didn't have better options, maybe
offering options to tune what we have would be worth implementing...
but by your standard ISTM we can't even do *that*.

>>> Another thing to keep in mind is that the compression area in general
>>> is a minefield of patents. We're fairly confident that pg_lzcompress
>>> as-is doesn't fall foul of any, but any significant change there would
>>> probably require more research.
>
>> A minefield of *expired* patents. Fast lz based compression is used
>> all over the place -- for example by the lucene.
>
> The patents that had to be dodged for original LZ compression are gone,
> true, but what's your evidence for saying that newer versions don't have
> newer patents?

That's impossible (at least for a non-attorney) to do because the
patents are still flying (for example:
http://www.google.com/patents/US7650040). That said, you've framed
the debate so that any improvement to postgres compression requires an
IP lawyer. That immediately raises some questions:

*) why hold only compression-type features in postgres to that
standard? Patents get mentioned here and there in the context of
other features in the archives but only compression seems to require a
proven clean pedigree. Why don't we require a patent search for
other interesting features? What evidence do *you* offer that lz4
violates any patents?

*) why is postgres the only FOSS project that cares about the
patent status of, say, lz4? (google 'lz4 patent')

merlin
