Re: [REVIEW] Re: Compression of full-page-writes

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-12-16 14:16:43
Message-ID: CAB7nPqRF-Tdr_LWHaOfc1MdMUpmU+1cLH6vGPKC1PDseSO8aZA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 16, 2014 at 8:35 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Tue, Dec 16, 2014 at 3:46 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Sat, Dec 13, 2014 at 9:36 AM, Michael Paquier
>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>> Something to be aware of btw is that this patch introduces an
>>> additional 8 bytes per block image in WAL as it contains additional
>>> information to control the compression. In this case this is the
>>> uint16 compress_len present in XLogRecordBlockImageHeader. In the case
>>> of the measurements done, knowing that 63638 FPWs have been written,
>>> there is a difference of a bit less than 500k in WAL between HEAD and
>>> "FPW off" in favor of HEAD. The gain with compression is welcome,
>>> still for the default there is a small price to track down if a block
>>> is compressed or not. This patch still takes advantage of it by not
>>> compressing the hole present in page and reducing CPU work a bit.
>>
>> That sounds like a pretty serious problem to me.
> OK. If that's so much a problem, I'll switch back to the version using
> 1 bit in the block header to identify if a block is compressed or not.
> This way, when switch will be off the record length will be the same
> as HEAD.
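
For scale, the overhead quoted above can be checked with quick arithmetic. This is only a sketch: the function name is hypothetical, and the two figures (63638 FPWs, 8 bytes per block image) are the ones quoted in this thread.

```python
# Figures quoted in the message above.
FPW_COUNT = 63638           # full-page writes recorded during the measurement
EXTRA_BYTES_PER_IMAGE = 8   # per-block-image overhead stated above


def extra_wal_bytes(fpw_count: int, bytes_per_image: int) -> int:
    """Total additional WAL volume caused by the larger block image header."""
    return fpw_count * bytes_per_image


overhead = extra_wal_bytes(FPW_COUNT, EXTRA_BYTES_PER_IMAGE)
print(f"{overhead} bytes = {overhead / 1024:.1f} KiB")
# prints "509104 bytes = 497.2 KiB"
```

That is consistent with the "bit less than 500k" difference mentioned above.
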
Attached are fresh patches that reduce the WAL record size to what it is in
HEAD when the compression switch is off. Looking at the logic in
xlogrecord.h, the block header stores the hole length and hole offset. I
changed that a bit to store the length of the raw block, with hole or
compressed, as the first uint16. The second uint16 is used to store the hole
offset, same as HEAD, when the compression switch is off. When compression is
on, a special value 0xFFFF is saved (actually, setting only the 16th bit to 1
would be enough...). Note that this forces the hole to be filled with zeros
and BLCKSZ worth of data to always be compressed.
These patches pass make check-world, as well as WAL replay on standbys.
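
To make the layout described above easier to follow, here is a rough sketch in Python (not the patch's C code). The function names, the little-endian byte order and the dictionary shape are all hypothetical. It also assumes that, when compression is off, the first field holds the image length with the hole removed, so that the hole length can be recovered as BLCKSZ minus that length. The patch itself may do this differently.

```python
import struct

BLCKSZ = 8192                # default PostgreSQL block size
COMPRESSED_MARKER = 0xFFFF   # special value stored in the second uint16

# Two uint16 fields, same size as the hole length/offset pair in HEAD.
# Little-endian byte order is an assumption made for this sketch.
HEADER = struct.Struct("<HH")


def encode_header(image_len, hole_offset=None):
    """Pack the header; hole_offset=None means the image is compressed."""
    second = COMPRESSED_MARKER if hole_offset is None else hole_offset
    return HEADER.pack(image_len, second)


def decode_header(buf):
    """Unpack the header and tell compressed and raw images apart."""
    image_len, second = HEADER.unpack(buf)
    if second == COMPRESSED_MARKER:
        # Compressed image: the hole was zero-filled before compression,
        # so there is no hole to restore after decompression.
        return {"compressed": True, "length": image_len,
                "hole_offset": None, "hole_length": 0}
    # Raw image (assumption): the hole was removed, so its length is
    # whatever is missing to reach a full block.
    return {"compressed": False, "length": image_len,
            "hole_offset": second, "hole_length": BLCKSZ - image_len}


raw = decode_header(encode_header(BLCKSZ - 4000, hole_offset=100))
packed = decode_header(encode_header(1234))
print(raw)
print(packed)
```

Since a hole offset is always smaller than BLCKSZ, 0xFFFF cannot collide with a real offset, which is why testing a single high bit would work as well.
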

I have also run measurements using this patch set. The following things can
be noticed:
- When the compression switch is off, the same quantity of WAL as HEAD is
produced.
- pglz is very bad at compressing the page hole. I mean, really bad. Have a
look at the user CPU, particularly when pages are empty, and you'll
understand... Other compression algorithms would be better here. Tests are
done with various values of fillfactor: 10 means that after the update 80%
of the page is empty, while at 50 the page is more or less completely full.

Here are the results, with 6 test cases:
- FPW on + 2 bytes: compression switch on, using 2 additional bytes in the
block header. WAL records are longer, as 8 more bytes are used per block,
but CPU usage is lower as page holes are not compressed by pglz.
- FPW off + 2 bytes: same as previous, with the compression switch off.
- FPW on + 0 bytes: compression switch on, using the same block header size
as HEAD, at the cost of compressing page holes filled with zeros.
- FPW off + 0 bytes: same as previous, with the compression switch off.
- HEAD: unpatched master (except for a hack to calculate user and system CPU).
- Record: record-level compression, with the compression lower bound set at
0.

=# select test || ', ffactor ' || ffactor, pg_size_pretty(post_update - pre_update), user_diff, system_diff from results;
           ?column?            | pg_size_pretty | user_diff | system_diff
-------------------------------+----------------+-----------+-------------
 FPW on + 2 bytes, ffactor 50  | 582 MB         | 42.391894 |    0.807444
 FPW on + 2 bytes, ffactor 20  | 229 MB         | 14.330304 |    0.729626
 FPW on + 2 bytes, ffactor 10  | 117 MB         |  7.335442 |    0.570996
 FPW off + 2 bytes, ffactor 50 | 746 MB         | 25.330391 |    1.248503
 FPW off + 2 bytes, ffactor 20 | 293 MB         | 10.537475 |    0.755448
 FPW off + 2 bytes, ffactor 10 | 148 MB         |  5.762775 |    0.763761
 FPW on + 0 bytes, ffactor 50  | 585 MB         | 54.115496 |    0.924891
 FPW on + 0 bytes, ffactor 20  | 234 MB         | 26.270404 |    0.755862
 FPW on + 0 bytes, ffactor 10  | 122 MB         | 19.540131 |    0.800981
 FPW off + 0 bytes, ffactor 50 | 746 MB         | 25.102241 |    1.110677
 FPW off + 0 bytes, ffactor 20 | 293 MB         |  9.889374 |    0.749884
 FPW off + 0 bytes, ffactor 10 | 148 MB         |  5.286767 |    0.682746
 HEAD, ffactor 50              | 746 MB         | 25.181729 |    1.133433
 HEAD, ffactor 20              | 293 MB         |  9.962242 |    0.765970
 HEAD, ffactor 10              | 148 MB         |  5.693426 |    0.775371
 Record, ffactor 50            | 582 MB         | 54.904374 |    0.678204
 Record, ffactor 20            | 229 MB         | 19.798268 |    0.807220
 Record, ffactor 10            | 116 MB         |  9.401877 |    0.668454
(18 rows)
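
To read the table relative to HEAD, the WAL reduction and the user CPU ratio can be derived from the numbers above. The sketch below is Python, and the function and variable names are made up for illustration. All figures are copied from the table, and only the variants with compression on are compared.

```python
# (WAL volume in MB, user CPU in seconds), keyed by test name then fillfactor.
# All figures are copied from the table above.
RESULTS = {
    "HEAD":             {50: (746, 25.181729), 20: (293, 9.962242),  10: (148, 5.693426)},
    "FPW on + 2 bytes": {50: (582, 42.391894), 20: (229, 14.330304), 10: (117, 7.335442)},
    "FPW on + 0 bytes": {50: (585, 54.115496), 20: (234, 26.270404), 10: (122, 19.540131)},
    "Record":           {50: (582, 54.904374), 20: (229, 19.798268), 10: (116, 9.401877)},
}


def compare_to_head(test, ffactor):
    """Return (WAL saved in percent, user CPU ratio) relative to HEAD."""
    head_wal, head_cpu = RESULTS["HEAD"][ffactor]
    wal, cpu = RESULTS[test][ffactor]
    return 100.0 * (1 - wal / head_wal), cpu / head_cpu


for test in ("FPW on + 2 bytes", "FPW on + 0 bytes", "Record"):
    for ffactor in (50, 20, 10):
        saved, ratio = compare_to_head(test, ffactor)
        print(f"{test}, ffactor {ffactor}: {saved:.1f}% less WAL, {ratio:.2f}x user CPU")
```

With these numbers, the 2-byte variant saves about as much WAL as the 0-byte variant while using noticeably less user CPU, which matches the remark about pglz and page holes.
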

Also attached are the results of the measurements and the test case used.
Regards,
--
Michael

Attachment Content-Type Size
20141216_fpw_compression_v7.tar.gz application/x-gzip 19.3 KB
results.sql application/octet-stream 1.7 KB
test_compress application/octet-stream 656 bytes
