Re: Compression of full-page-writes

From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Compression of full-page-writes
Date: 2013-10-18 06:28:58
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Sorry for my reply late...

(2013/10/11 2:32), Fujii Masao wrote:
> Could you let me know how much WAL records were generated
> during each benchmark?
It was not seen difference hardly about WAL in DBT-2 benchmark. It was because
largest tuples are filled in random character which is difficult to compress, I
survey it.

So I test two pattern data. One is original data which is hard to compress data.
Second is little bit changing data which are easy to compress data. Specifically,
I substitute zero padding tuple for random character tuple.
Record size is same in original test data, I changed only character fo record.
Sample changed record is here.

* Original record (item table)
> 1 9830 W+ùMî/aGhÞVJ;t+Pöþm5v2î. 82.62 Tî%N#ROò|?ö;[_îë~!YäHPÜï[S!JV58Ü#;+$cPì=dãNò;=Þô5
> 2 1492 VIKëyC..UCçWSèQð2?&s÷Jf 95.78 >ýoCj'nîHR`i]cøuDH&-wì4èè}{39ámLß2mC712Tao÷
> 3 4485 oJ)kLvP^_:91BOïé 32.00 ð<èüJ÷RÝ_Jze+?é4Ü7ä-r=DÝK\\$;Fsà8ál5

* Changed sample record (item table)
> 1 9830 000000000000000000000000 95.77 00000000000000000000000000000000000000000
> 2 764 00000000000000 47.92 00000000000000000000000000000000000000000000000000
> 3 4893 000000000000000000000 15.90 00000000000000000000000000000000000

* DBT-2 Result

@Werehouse = 340
| NOTPM | 90%tile | Average | S.Deviation
no-patched | 3319.02 | 13.606648 | 7.589 | 8.428
patched | 3341.25 | 20.132364 | 7.471 | 10.458
patched-testdata_changed| 3738.07 | 20.493533 | 3.795 | 10.003

Compression patch gets higher performance than no-patch in easy to compress test
data. It is because compression patch make archive WAL more small size, in
result, waste file cache is less than no-patch. Therefore, it was inflected
file-cache more effectively.

However, test in hard to compress test data have little bit lessor performance
than no-patch. I think it is compression overhead in pglz.

> I think that this benchmark result clearly means that the patch
> has only limited effects in the reduction of WAL volume and
> the performance improvement unless the database contains
> highly-compressible data like pgbench_accounts.
Your expectation is right. I think that low CPU cost and high compression
algorithm make your patch more better and better performance, too.

> filler. But if
> we can use other compression algorithm, maybe we can reduce
> WAL volume very much.
Yes, Please!

> I'm not sure what algorithm is good for WAL compression, though.
Community member think Snappy or lz4 is better. You'd better to select one,
or test two algorithms.

> It might be better to introduce the hook for compression of FPW
> so that users can freely use their compression module, rather
> than just using pglz_compress(). Thought?
In my memory, Andres Freund developed like this patch. Did it commit or
developing now? I have thought this idea is very good.

Mitsumasa KONDO
NTT Open Source Software Center

Attachment Content-Type Size
image/png 11.5 KB
image/png 9.3 KB
image/png 10.4 KB
image/png 10.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Kirkwood 2013-10-18 07:17:35 Re: ERROR : 'tuple concurrently updated'
Previous Message Fabien COELHO 2013-10-18 06:19:24 Re: Multiple psql -c / -f options