Re: [REVIEW] Re: Compression of full-page-writes

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Rahila Syed <rahilasyed90(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-12-28 13:57:19
Message-ID: CAB7nPqQOOzd5FLVkg-SN1cFf5Pi2ky3LTQecoBtS2Ws+jq=A2Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 26, 2014 at 4:16 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Fri, Dec 26, 2014 at 3:24 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> pglz_compress() and pglz_decompress() still use PGLZ_Header, so the frontend
>> which uses those functions needs to handle PGLZ_Header. But it basically should
>> be handled via the varlena macros. That is, the frontend still seems to need to
>> understand the varlena datatype. I think we should avoid that. Thought?
> Hm, yes it may be wiser to remove it and make the data passed to pglz
> for varlena 8 bytes shorter..

OK, here is the result of this work, split into 3 patches.

The first two patches move the pglz code to src/common and make it a frontend
utility entirely independent of varlena and its related metadata.
- Patch 1 is a simple move of pglz to src/common, with PGLZ_Header still
present. There is nothing amazing here; that's the broken version that was
reverted in 966115c.
- The real stuff comes with patch 2, which removes PGLZ_Header and changes
the pglz compression and decompression APIs so that they no longer carry any
TOAST metadata; this metadata is now handled locally in tuptoaster.c. Note
that this patch preserves the on-disk format (tested with pg_upgrade from 9.4
to a patched HEAD server). Here is how the compression and decompression APIs
look with this patch, simply operating from a source buffer to a destination
buffer:
extern int32 pglz_compress(const char *source, int32 slen, char *dest,
                           const PGLZ_Strategy *strategy);
extern int32 pglz_decompress(const char *source, char *dest,
                             int32 compressed_size, int32 raw_size);
The return value of those functions is the number of bytes written to the
destination buffer, or 0 if the operation failed. This should also make the
backend more pluggable. The reason patch 2 exists as a separate patch (it
could be merged with patch 1) is to facilitate review of the changes made to
pglz to turn it into an entirely independent facility.
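
For illustration, here is a minimal caller-side sketch of the proposed API;
note that PGLZ_MAX_OUTPUT, PGLZ_strategy_default and the include path under
src/common are assumptions carried over from the existing pglz code, not
something defined in this mail:

#include "postgres.h"
#include "common/pg_lzcompress.h"   /* assumed new location under src/common */

static bool
pglz_roundtrip_check(const char *raw, int32 rawlen)
{
    char   *compressed = palloc(PGLZ_MAX_OUTPUT(rawlen));
    char   *restored = palloc(rawlen);
    int32   complen;
    bool    ok;

    /* per the description above, 0 means the compression attempt failed */
    complen = pglz_compress(raw, rawlen, compressed, PGLZ_strategy_default);
    if (complen == 0)
    {
        pfree(compressed);
        pfree(restored);
        return false;
    }

    /* both lengths must now be given by the caller, no PGLZ_Header anymore */
    ok = (pglz_decompress(compressed, restored, complen, rawlen) == rawlen &&
          memcmp(raw, restored, rawlen) == 0);

    pfree(compressed);
    pfree(restored);
    return ok;
}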

Patch 3 is the FPW compression itself, adjusted to those changes. Note that
as PGLZ_Header, which contained the raw size of the compressed data, does not
exist anymore, it is necessary to store the raw length of the block image
directly in the block image header, using 2 additional bytes. Those 2 bytes
are used only if wal_compression is set to true, thanks to a boolean flag, so
if wal_compression is disabled the WAL record length is exactly the same as
on HEAD and there is no penalty in the default case. As in the previous
patches, the block image is compressed without its hole.
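
To make those 2 extra bytes concrete, here is a purely illustrative layout;
the struct and field names below are hypothetical and not taken from the
patch:

#include "postgres.h"       /* for uint8/uint16 */

/*
 * Illustration only: hypothetical names, not the actual structs of the
 * patch.  The point is that the raw (uncompressed) length is stored with
 * the block image header only when the image is compressed.
 */
typedef struct BlockImageHeaderSketch
{
    uint16      length;         /* length of the stored (possibly compressed) image */
    uint16      hole_offset;    /* start of the page hole removed from the image */
    uint8       flags;          /* one bit indicates whether the image is compressed */

    /*
     * Present only when the compression bit is set in flags, adding the 2
     * bytes mentioned above; this is the raw_size argument later passed to
     * pglz_decompress().
     */
    /* uint16   raw_length; */
} BlockImageHeaderSketch;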

To finish, here are some results using the same test as in the message below,
with the hack on getrusage to get the system and user CPU diff of a single
backend execution:
http://www.postgresql.org/message-id/CAB7nPqSc97o-UE5paxfMUKWcxE_JioyxO1M4A0pMnmYqAnec2g@mail.gmail.com
Just as a reminder, this test generated a fixed number of FPWs on a single
backend, with fsync and autovacuum disabled, using several values of
fillfactor to see the effect of page holes.

  test   | ffactor | user_diff | system_diff | pg_size_pretty
---------+---------+-----------+-------------+----------------
 FPW on  |      50 | 48.823907 |    0.737649 | 582 MB
 FPW on  |      20 | 16.135000 |    0.764682 | 229 MB
 FPW on  |      10 |  8.521099 |    0.751947 | 116 MB
 FPW off |      50 | 29.722793 |    1.045577 | 746 MB
 FPW off |      20 | 12.673375 |    0.905422 | 293 MB
 FPW off |      10 |  6.723120 |    0.779936 | 148 MB
 HEAD    |      50 | 30.763136 |    1.129822 | 746 MB
 HEAD    |      20 | 13.340823 |    0.893365 | 293 MB
 HEAD    |      10 |  7.267311 |    0.909057 | 148 MB
(9 rows)

Results are similar to what has been measured previously (it doesn't hurt to
check again): roughly, the CPU cost is balanced by the reduction in WAL size.
There is 0 byte of difference in terms of WAL record length between HEAD and
this patch when wal_compression = off.

Patches, as well as the test script and the results, are attached.
Regards,
--
Michael

Attachment                           Content-Type               Size
results.sql                          application/octet-stream   1.0 KB
test_compress                        application/octet-stream   656 bytes
20141228_fpw_compression_v12.tar.gz  application/x-gzip         23.6 KB
