Re: Compression of full-page-writes

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Compression of full-page-writes
Date: 2013-09-30 03:49:42
Message-ID: CAHGQGwF+KcJfzHmvK=_aD7PecVDsP1OA2sEz6-JuYeJtcVq1hA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 11, 2013 at 7:39 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Fri, Aug 30, 2013 at 11:55 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> Hi,
>>
>> The attached patch adds a new GUC parameter, 'compress_backup_block'.
>> When this parameter is enabled, the server compresses each FPW
>> (full-page write) in WAL using pglz_compress() before inserting it
>> into the WAL buffers. The compressed FPW is then decompressed
>> during recovery. This is a very simple patch.
>>
>> The purpose of this patch is to reduce WAL size.
>> Under heavy write load, the server needs to write a large amount of
>> WAL, and this is likely to be a bottleneck. What's worse,
>> in replication, a large amount of WAL harms
>> not only WAL writing on the master, but also WAL streaming and
>> WAL writing on the standby. We would also need to spend more
>> money on storage to hold that much data.
>> I'd like to alleviate these problems by reducing WAL size.
>>
>> My idea is very simple: just compress FPWs, because they are
>> a big part of WAL. I used pglz_compress() as the compression method,
>> but you might think that another method is better. We can add
>> something like an FPW-compression hook for that later. The patch
>> adds a new GUC parameter, but I'm thinking of merging it into the
>> full_page_writes parameter to avoid increasing the number of GUCs. That is,
>> I'm thinking of changing full_page_writes so that it accepts a new value,
>> 'compress'.
>
> Done. Attached is the updated version of the patch.
>
> In this patch, full_page_writes accepts three values: on, compress, and off.
> When it's set to compress, the full page image is compressed before it's
> inserted into the WAL buffers.
>
> I measured how much this patch affects the performance and the WAL
> volume again, and I also measured how much this patch affects the
> recovery time.
>
> * Server spec
> CPU: 8core, Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
> Mem: 16GB
> Disk: 500GB SSD Samsung 840
>
> * Benchmark
> pgbench -c 32 -j 4 -T 900 -M prepared
> scaling factor: 100
>
> checkpoint_segments = 1024
> checkpoint_timeout = 5min
> (every checkpoint during the benchmark was triggered by checkpoint_timeout)
>
> * Result
> [tps]
> 1344.2 (full_page_writes = on)
> 1605.9 (compress)
> 1810.1 (off)
>
> [amount of WAL generated while running pgbench]
> 4422 MB (on)
> 1517 MB (compress)
> 885 MB (off)
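
To put the quoted numbers in relative terms, here is a small sketch that
computes the change against full_page_writes = on. The figures are copied
from the result above; the helper name pct_change is only for illustration.

```python
# Figures copied from the benchmark result quoted above.
tps = {"on": 1344.2, "compress": 1605.9, "off": 1810.1}
wal_mb = {"on": 4422, "compress": 1517, "off": 885}


def pct_change(new, base):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


if __name__ == "__main__":
    for mode in ("compress", "off"):
        print(
            f"{mode:>8}: tps {pct_change(tps[mode], tps['on']):+.1f}%, "
            f"WAL {pct_change(wal_mb[mode], wal_mb['on']):+.1f}%"
        )
```

Both settings are compared against "on" because that is the baseline the
patch is trying to improve.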

On second thought, the patch probably compressed the WAL this well only
because I used pgbench.
Most of the data in pgbench is the "filler" column of the pgbench_accounts
table, i.e., blank-padded empty strings. So, the compression ratio of the WAL
was very high.
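
A rough way to see the effect, as a sketch only: the script below builds a
synthetic page of rows that are mostly blank padding and compares its
compression ratio with a page of random bytes. It uses Python's zlib as a
stand-in, not pglz_compress(), and the row layout (three 4-byte integers
plus an 84-byte filler) is an assumption that only approximates
pgbench_accounts; it is not the real heap page format.

```python
import random
import struct
import zlib

PAGE_SIZE = 8192   # default PostgreSQL block size
FILLER_LEN = 84    # assumed width of pgbench_accounts.filler


def pgbench_like_page(rng):
    """Build a synthetic page of (aid, bid, abalance, filler) rows."""
    row_len = 12 + FILLER_LEN
    rows = []
    for aid in range(PAGE_SIZE // row_len):
        ints = struct.pack("<iii", aid, 1, rng.randint(-5000, 5000))
        rows.append(ints + b" " * FILLER_LEN)  # blank-padded filler
    return b"".join(rows)


def random_page(rng, size):
    """Build a page of random bytes, which is close to incompressible."""
    return bytes(rng.getrandbits(8) for _ in range(size))


def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    padded = pgbench_like_page(rng)
    print(f"blank-padded rows: {ratio(padded):.2f}")
    print(f"random bytes:      {ratio(random_page(rng, len(padded))):.2f}")
```

Real workloads will fall somewhere between these two extremes, which is why
a benchmark with more realistic data is needed.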

I will repeat the same measurement with another benchmark.

Regards,

--
Fujii Masao
