Re: [REVIEW] Re: Compression of full-page-writes

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-12-12 15:04:29
Message-ID: 20141212150429.GL31413@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-12-12 23:50:43 +0900, Michael Paquier wrote:
> I got curious to see how compression of an entire record would perform
> and how it compares for small WAL records, so here are some numbers
> based on the attached patch. This patch compresses the whole record,
> including the block headers, leaving only XLogRecord out of it, with a
> flag indicating that the record is compressed (note that the replay
> portion of this patch is untested; still, it gives an idea of how much
> whole-record compression affects user CPU in this test case). It uses a
> buffer of 4 * BLCKSZ; if the record is longer than that, compression is
> simply given up. These tests use the hack upthread that calculates user
> and system CPU with getrusage() within a backend.
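
For illustration, here is a minimal sketch of that scheme, assuming the
pglz_compress() API from the 9.5 development cycle (it returns the
compressed length, or -1 when compression fails to shrink the data); the
XLR_COMPRESSED flag bit and the compress_record_payload() helper are
hypothetical stand-ins, not the attached patch's actual code.

#include "postgres.h"
#include "access/xlogrecord.h"
#include "common/pg_lzcompress.h"

#define COMPRESS_BUFSZ (4 * BLCKSZ)        /* give up beyond this size */
#define XLR_COMPRESSED 0x02                /* hypothetical flag bit */

static char compress_buf[COMPRESS_BUFSZ];  /* scratch space for pglz */

/*
 * Try to compress the record payload (block headers included, the
 * XLogRecord header itself excluded).  Returns the compressed length,
 * or -1 to mean "store the record uncompressed".
 */
static int32
compress_record_payload(XLogRecord *rechdr, const char *payload, uint32 len)
{
    int32 clen;

    /* Records longer than the scratch buffer are not compressed at all. */
    if (len > COMPRESS_BUFSZ)
        return -1;

    clen = pglz_compress(payload, (int32) len, compress_buf,
                         PGLZ_strategy_default);
    if (clen < 0)
        return -1;              /* incompressible: keep the record as-is */

    rechdr->xl_info |= XLR_COMPRESSED;  /* mark the record as compressed */
    return clen;
}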
>
> Here is the simple test case I used, with 512MB of shared_buffers and
> small records: fill up a bunch of buffers, dirty them, and then compress
> FPWs with a checkpoint.
> #!/bin/bash
> psql <<EOF
> SELECT pg_backend_pid();
> CREATE TABLE aa (a int);
> CREATE TABLE results (phase text, position pg_lsn);
> CREATE EXTENSION IF NOT EXISTS pg_prewarm;
> ALTER TABLE aa SET (FILLFACTOR = 50);
> INSERT INTO results VALUES ('pre-insert', pg_current_xlog_location());
> INSERT INTO aa VALUES (generate_series(1,7000000)); -- 484MB
> SELECT pg_size_pretty(pg_relation_size('aa'::regclass));
> SELECT pg_prewarm('aa'::regclass);
> CHECKPOINT;
> INSERT INTO results VALUES ('pre-update', pg_current_xlog_location());
> UPDATE aa SET a = 7000000 + a;
> CHECKPOINT;
> INSERT INTO results VALUES ('post-update', pg_current_xlog_location());
> SELECT * FROM results;
> EOF
>
> Note that autovacuum and fsync are off.
> =# select phase, user_diff, system_diff,
>        pg_size_pretty(pre_update - pre_insert),
>        pg_size_pretty(post_update - pre_update) from results;
>        phase        | user_diff | system_diff | pg_size_pretty | pg_size_pretty
> --------------------+-----------+-------------+----------------+----------------
>  Compression FPW    | 42.990799 |    0.868179 | 429 MB         | 567 MB
>  No compression     | 25.688731 |    1.236551 | 429 MB         | 727 MB
>  Compression record | 56.376750 |    0.769603 | 429 MB         | 566 MB
> (3 rows)
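
The user_diff and system_diff columns come from the getrusage() hack
mentioned above. A self-contained sketch of that style of measurement
(the workload placeholder and the output format here are illustrative,
not the hack's actual code):

#include <stdio.h>
#include <sys/resource.h>

/* Convert a struct timeval into seconds as a double. */
static double
tv_seconds(struct timeval tv)
{
    return tv.tv_sec + tv.tv_usec / 1000000.0;
}

int
main(void)
{
    struct rusage before, after;

    getrusage(RUSAGE_SELF, &before);
    /* ... run the workload being measured ... */
    getrusage(RUSAGE_SELF, &after);

    /* user_diff / system_diff, as in the table above */
    printf("user_diff = %f\n",
           tv_seconds(after.ru_utime) - tv_seconds(before.ru_utime));
    printf("system_diff = %f\n",
           tv_seconds(after.ru_stime) - tv_seconds(before.ru_stime));
    return 0;
}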
> If we do record-level compression, we'll need to be very careful in
> defining a lower bound so as not to eat CPU resources unnecessarily,
> perhaps something that should be controlled with a GUC. I presume that
> this holds true for the upper bound as well.

Record-level compression pretty obviously would need a lower boundary
for when to use compression. It won't be useful for small heapam/btree
records, but it'll be rather useful for large multi_insert, clean, or
similar records...
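
A lower bound of that kind could be little more than a guard before
attempting compression; a sketch, with a hypothetical GUC
wal_compress_min_len standing in for whatever knob is eventually chosen,
and COMPRESS_BUFSZ as in the earlier sketch:

/* Hypothetical GUC: records shorter than this are never compressed. */
int wal_compress_min_len = 128;

/*
 * Decide whether a record of the given total length is worth feeding
 * to the compressor at all.
 */
static bool
should_compress_record(uint32 total_len)
{
    /* Small heapam/btree records: overhead outweighs any savings. */
    if (total_len < (uint32) wal_compress_min_len)
        return false;

    /* Upper bound: the scratch buffer caps what can be compressed. */
    if (total_len > COMPRESS_BUFSZ)
        return false;

    return true;
}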

Greetings,

Andres Freund
