
Re: Improve compression speeds in pg_lzcompress.c

From: Takeshi Yamamuro <yamamuro(dot)takeshi(at)lab(dot)ntt(dot)co(dot)jp>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improve compression speeds in pg_lzcompress.c
Date: 2013-01-08 09:04:24
Message-ID: 50EBE118.1030409@lab.ntt.co.jp
Lists: pgsql-hackers
Hi,

 >>>> Why would that be a good tradeoff to make? Larger stored values require
 >>>> more I/O, which is likely to swamp any CPU savings in the compression
 >>>> step. Not to mention that a value once written may be read many times,
 >>>> so the extra I/O cost could be multiplied many times over later on.
 >>> I agree with this analysis, but I note that the test results show it
 >>> actually improving things along both parameters.
 >> Hm ... one of us is reading those results backwards, then.
I think that it's a parameter-tuning issue.
I added two parameters, PGLZ_SKIP_SIZE and PGLZ_HASH_GAP, and
set PGLZ_SKIP_SIZE=3 and PGLZ_HASH_GAP=8 for the quick tests.
I also found that the performance of my patch was nearly
equal to that of the current implementation when
PGLZ_SKIP_SIZE=1 and PGLZ_HASH_GAP=1.

Apart from my patch, what I care about is that the current
implementation might be too slow relative to I/O. For example, when
compressing and writing large values, compression (20-40MiB/s) might
be a drag on writing the data to disk (50-80MiB/s). Moreover,
IMHO modern (and very fast) I/O subsystems such as SSDs make this
a bigger issue.
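
To make the throughput argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that compression and disk writes are pipelined, so the slower stage sets the pace. The function name and the 0.5 compression ratio are illustrative assumptions, not measurements from the patch; only the 20-40MiB/s and 50-80MiB/s ranges come from the numbers above.

```python
def effective_input_rate(compress_mib_s, write_mib_s, ratio):
    """Input MiB/s sustained by a pipelined compress-then-write path.

    ratio is compressed_size / original_size (a hypothetical value here).
    The disk only has to absorb ratio * input, so measured in input
    terms it can keep up with write_mib_s / ratio.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return min(compress_mib_s, write_mib_s / ratio)


# Sweep the ends of the ranges quoted in the mail, with an assumed 0.5 ratio.
for compress in (20, 40):
    for write in (50, 80):
        rate = effective_input_rate(compress, write, 0.5)
        bound = "compression" if rate == compress else "disk"
        print(f"compress={compress} write={write} -> {rate:.0f} MiB/s ({bound}-bound)")
```

Under these assumptions every combination comes out compression-bound, which is the concern stated above. A faster disk only widens the gap.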

So I think it's worth continuing the discussion on improving
compression for 9.4 or later.


 > Another thing to keep in mind is that the compression area in general
 > is a minefield of patents.  We're fairly confident that pg_lzcompress
 > as-is doesn't fall foul of any, but any significant change there would
 > probably require more research.
Agreed, and we know that
we need patent-free ideas to improve compression.
For example, a pluggable compression interface, or something similar.


 > I just went back and looked. Unless I'm misreading it he has about a 2.5
 > times speed improvement but about a 20% worse compression result.
 >
 > What would be interesting would be to see if the knobs he's tweaked
 > could be tweaked a bit more to give us substantial speedup without
 > significant space degradation.
Yes, you're right, though these results depend heavily
on the data set.
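
As a rough illustration of the tradeoff being discussed, the sketch below compares a baseline compressor with one that is 2.5 times faster but produces 20% larger output (one reading of "20% worse compression"). All concrete rates, the 0.50 baseline ratio, and the function and variable names are hypothetical. The model is deliberately simple: stages run one after another, reads and writes share one disk rate, every read goes to disk, and decompression cost is ignored.

```python
def total_seconds(size_mib, compress_mib_s, ratio, disk_mib_s, n_reads):
    """Seconds to compress and write a value once, then read it n_reads times.

    ratio is compressed_size / original_size. Decompression time and
    caching are ignored (simplifying assumptions, not from the thread).
    """
    stored_mib = size_mib * ratio
    compress = size_mib / compress_mib_s
    write = stored_mib / disk_mib_s
    reads = n_reads * stored_mib / disk_mib_s
    return compress + write + reads


# Hypothetical baseline: 30 MiB/s compression, output half the input size.
BASE = dict(compress_mib_s=30.0, ratio=0.50)
# Hypothetical patched build: 2.5x faster, output 20% larger.
FAST = dict(compress_mib_s=75.0, ratio=0.60)

for n_reads in (0, 5, 20):
    base = total_seconds(100, disk_mib_s=60.0, n_reads=n_reads, **BASE)
    fast = total_seconds(100, disk_mib_s=60.0, n_reads=n_reads, **FAST)
    print(f"{n_reads:>2} reads: baseline {base:6.2f}s, faster/larger {fast:6.2f}s")
```

With these made-up numbers the faster compressor wins when the value is rarely read back from disk and loses once it is read many times, so the outcome depends on the workload as well as on the data.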


regards,
-- 
----
Takeshi Yamamuro
NTT Cyber Communications Laboratory Group
Software Innovation Center
(Open Source Software Center)
Tel: +81-3-5860-5057 Fax: +81-3-5463-5490
Mail:yamamuro(dot)takeshi(at)lab(dot)ntt(dot)co(dot)jp

