Re: Improve compression speeds in pg_lzcompress.c

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Takeshi Yamamuro <yamamuro(dot)takeshi(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improve compression speeds in pg_lzcompress.c
Date: 2013-01-07 21:36:25
Message-ID: CAHyXU0znAiQ8Ok1ZjZ3-bufzsjZXBLpFDPDOWiW7bY5gCnVh7g@mail.gmail.com
Lists: pgsql-hackers
On Mon, Jan 7, 2013 at 2:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Mon, Jan 7, 2013 at 10:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Takeshi Yamamuro <yamamuro(dot)takeshi(at)lab(dot)ntt(dot)co(dot)jp> writes:
>>>> The attached is a patch to improve compression speeds with loss of
>>>> compression ratios in backend/utils/adt/pg_lzcompress.c.
>
>>> Why would that be a good tradeoff to make?  Larger stored values require
>>> more I/O, which is likely to swamp any CPU savings in the compression
>>> step.  Not to mention that a value once written may be read many times,
>>> so the extra I/O cost could be multiplied many times over later on.
>
>> I disagree.  pg compression is so awful it's almost never a net win.
>> I turn it off.
>
> One report doesn't make it useless, but even if it is so on your data,
> why would making it even less effective be a win?

That's a fair point.  I'm neutral on the OP's proposal -- it's just
moving spots around the dog.  If we didn't have better options, maybe
offering options to tune what we have would be worth implementing...
but by your standard ISTM we can't even do *that*.

>>> Another thing to keep in mind is that the compression area in general
>>> is a minefield of patents.  We're fairly confident that pg_lzcompress
>>> as-is doesn't fall foul of any, but any significant change there would
>>> probably require more research.
>
>> A minefield of *expired* patents.  Fast lz based compression is used
>> all over the place -- for example, by Lucene.
>
> The patents that had to be dodged for original LZ compression are gone,
> true, but what's your evidence for saying that newer versions don't have
> newer patents?

That's impossible (at least for a non-attorney) to establish, because
new patents are still flying (for example:
http://www.google.com/patents/US7650040).  That said, you've framed
the debate so that any improvement to postgres compression requires an
IP lawyer.  That immediately raises some questions:

*) why hold only compression-type features in postgres to that
standard?  Patents get mentioned here and there in the archives in the
context of other features, but only compression seems to require a
proven clean pedigree.  Why don't we require a patent search for
other interesting features?  What evidence do *you* offer that lz4
violates any patents?

*) why is postgres the only FOSS project that cares about the
patentability of, say, lz4?  (google 'lz4 patent')

merlin

