Re: Compression of full-page-writes

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Compression of full-page-writes
Date: 2014-06-10 14:49:46
Message-ID: CAH2L28uKFngZj7hVWF_x_yq7r_3OSXa=VCAhK+V0abs1urvfUg@mail.gmail.com
Lists: pgsql-hackers

Hello,

To make it easier to change compression algorithms, and to allow recovery
from WAL records compressed with different algorithms, information about
the compression algorithm can be stored in the WAL record itself.

The XLOG record header has 2 to 4 padding bytes that align the WAL
record. This space can hold a new flag that stores information about the
compression algorithm used. Like the xl_info field of the XLogRecord
struct, an 8-bit flag can be constructed, with the lower 4 bits indicating
which of the backup blocks 0, 1, 2, 3 is compressed. The upper 4 bits can
indicate the state of compression, i.e. off, lz4, snappy or pglz.

The flag can be extended to cover any compression algorithms added in
the future.
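
To make the proposed layout concrete, here is a rough sketch of how such a
flag byte could be packed and read. It is only an illustration: every name
below (BkpCompressAlgo, bkp_flag_make and so on) is hypothetical and not
taken from the patch, the numeric values assigned to the algorithms are
arbitrary, and it assumes the lower 4 bits act as a bitmask with one bit per
backup block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identifiers; the numeric values are an assumption. */
typedef enum
{
    BKP_COMPRESS_OFF = 0,
    BKP_COMPRESS_PGLZ = 1,
    BKP_COMPRESS_SNAPPY = 2,
    BKP_COMPRESS_LZ4 = 3
    /* values 4..15 stay free for algorithms added later */
} BkpCompressAlgo;

#define BKP_BLOCK_MASK  0x0F    /* low nibble: one bit per backup block 0..3 */
#define BKP_ALGO_SHIFT  4       /* high nibble: compression algorithm */

/* Pack an algorithm and a per-block bitmask into one 8-bit flag. */
static inline uint8_t
bkp_flag_make(BkpCompressAlgo algo, uint8_t block_bits)
{
    assert((unsigned) algo <= 0x0F);
    assert((block_bits & ~BKP_BLOCK_MASK) == 0);
    return (uint8_t) (((unsigned) algo << BKP_ALGO_SHIFT) | block_bits);
}

/* Extract the algorithm stored in the high nibble. */
static inline BkpCompressAlgo
bkp_flag_algo(uint8_t flag)
{
    return (BkpCompressAlgo) (flag >> BKP_ALGO_SHIFT);
}

/* Report whether backup block block_no (0..3) is marked as compressed. */
static inline bool
bkp_flag_block_is_compressed(uint8_t flag, int block_no)
{
    assert(block_no >= 0 && block_no < 4);
    return (flag & (1u << block_no)) != 0;
}
```

With this layout, bkp_flag_make(BKP_COMPRESS_LZ4, 0x05) would mark backup
blocks 0 and 2 as LZ4-compressed. Four bits leave room for 16 algorithm
identifiers, which is what would allow the flag to grow later.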

What is your opinion on this?

Thank you,

Rahila Syed

On Tue, May 27, 2014 at 9:27 AM, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>
wrote:

> Hello All,
>
> 0001-CompressBackupBlock_snappy_lz4_pglz extends the patch on compression
> of full-page writes to include LZ4 and Snappy. Changes include converting
> the "compress_backup_block" GUC from a boolean to an enum. The GUC can be
> set to OFF, pglz, snappy or lz4, which either turns compression off or
> selects the desired compression algorithm.
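
As a rough, standalone illustration of the boolean-to-enum change described
above, the sketch below maps the four setting names to an enum value. It
does not use PostgreSQL's GUC machinery, and the type and function names
(CompressChoice, parse_compress_backup_block) are hypothetical rather than
taken from the patch. Case-insensitive matching is an assumption, and
strcasecmp is a POSIX function.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>            /* strcasecmp (POSIX) */

/* Hypothetical enum for the values the setting accepts. */
typedef enum
{
    COMPRESS_OFF,
    COMPRESS_PGLZ,
    COMPRESS_SNAPPY,
    COMPRESS_LZ4
} CompressChoice;

typedef struct
{
    const char *name;
    CompressChoice value;
} CompressOption;

/* Lookup table from setting text to enum value. */
static const CompressOption compress_options[] = {
    {"off", COMPRESS_OFF},
    {"pglz", COMPRESS_PGLZ},
    {"snappy", COMPRESS_SNAPPY},
    {"lz4", COMPRESS_LZ4}
};

/* Returns 1 and sets *out for a recognised name, 0 otherwise. */
static int
parse_compress_backup_block(const char *text, CompressChoice *out)
{
    size_t i;
    size_t n = sizeof(compress_options) / sizeof(compress_options[0]);

    assert(text != NULL && out != NULL);
    for (i = 0; i < n; i++)
    {
        if (strcasecmp(text, compress_options[i].name) == 0)
        {
            *out = compress_options[i].value;
            return 1;
        }
    }
    return 0;
}
```

An unrecognised name leaves *out untouched and returns 0, so the caller can
reject the setting instead of silently falling back to a default.
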
>
> 0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It
> uses Andres’s patch to get the Makefiles working and adds a few wrappers
> that call the LZ4 and Snappy compression functions and handle varlena
> datatypes.
> Patch Courtesy: Pavan Deolasee
>
> These patches serve as a way to test various compression algorithms. They
> are still WIP. They don’t support changing compression algorithms on the
> standby.
> Also, the compress_backup_block GUC needs to be merged with
> full_page_writes.
> The patch uses the LZ4 high compression (HC) variant.
> I have conducted initial tests, which I would like to share in order to
> solicit feedback.
>
> The tests use the JDBC runner TPC-C benchmark to measure the amount of WAL
> compression, tps and response time in each of the following scenarios:
> compression = OFF, pglz, LZ4, snappy, and FPW = off.
>
> Server specifications:
> Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) x 2
> RAM: 32 GB
> Disk: HDD 450 GB 10K Hot Plug 2.5-inch SAS HDD x 8
> 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm
>
>
> Benchmark:
> Scale : 100
> Command : java JR /home/postgres/jdbcrunner-1.2/scripts/tpcc.js -sleepTime 600,350,300,250,250
> Warmup time : 1 sec
> Measurement time : 900 sec
> Number of tx types : 5
> Number of agents : 16
> Connection pool size : 16
> Statement cache size : 40
> Auto commit : false
> Sleep time : 600,350,300,250,250 msec
>
> Checkpoint segments : 1024
> Checkpoint timeout : 5 mins
>
>
> Scenario      WAL generated (bytes)    Compression (bytes saved)   TPS (tx1,tx2,tx3,tx4,tx5)
> No_compress   2220787088 (~2221 MB)    NULL                        13.3,13.3,1.3,1.3,1.3 tps
> Pglz          1796213760 (~1796 MB)    424573328 (19.11%)          13.1,13.1,1.3,1.3,1.3 tps
> Snappy        1724171112 (~1724 MB)    496615976 (22.36%)          13.2,13.2,1.3,1.3,1.3 tps
> LZ4(HC)       1658941328 (~1659 MB)    561845760 (25.29%)          13.2,13.2,1.3,1.3,1.3 tps
> FPW(off)      139384320 (~139 MB)      NULL                        13.3,13.3,1.3,1.3,1.3 tps
>
> As per the measurement results, WAL reduction using LZ4 is close to 25%,
> which is about 6 percentage points more than with pglz. WAL reduction
> with Snappy is close to 22%.
> The gains from LZ4 and Snappy over pglz do not seem very high for the
> given workload. This may be due to the incompressible nature of the TPC-C
> data, which contains random strings.
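
For anyone wanting to re-derive the reduction figures in the table above,
here is a small sketch of the arithmetic. It assumes the percentage is
simply the bytes saved divided by the WAL volume of the No_compress run.
The function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes saved relative to the uncompressed baseline run. */
static uint64_t
wal_bytes_saved(uint64_t baseline, uint64_t compressed)
{
    assert(compressed <= baseline);
    return baseline - compressed;
}

/* Reduction as a percentage of the baseline WAL volume. */
static double
wal_reduction_percent(uint64_t baseline, uint64_t compressed)
{
    assert(baseline > 0);
    return 100.0 * (double) wal_bytes_saved(baseline, compressed)
        / (double) baseline;
}
```

For the pglz row, wal_bytes_saved(2220787088, 1796213760) gives 424573328,
which matches the table. The gap between LZ4 and pglz is then the difference
of the two percentages, i.e. percentage points rather than a relative
increase.
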
>
> Compression does not have a bad impact on response time. In fact,
> response times for Snappy and LZ4 are much better than without
> compression, at almost 1/2 to 1/3 of the response times of no compression
> (FPW = on) and FPW = off.
> The response time order for each type of compression is
> Pglz > Snappy > LZ4
>
> Scenario      Response time (tx1,tx2,tx3,tx4,tx5)
> no_compress   5555,1848,4221,6791,5747 msec
> pglz          4275,2659,1828,4025,3326 msec
> Snappy        3790,2828,2186,1284,1120 msec
> LZ4(HC)       2519,2449,1158,2066,2065 msec
> FPW(off)      6234,2430,3017,5417,5885 msec
>
> LZ4 and Snappy are almost on par with each other in terms of response
> time, as the average response time across the five transaction types is
> almost the same for both.
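
The comparison of averages can be checked with a few lines of C. This is
only a sketch that takes the plain arithmetic mean of the five per-transaction
response times in one table row. Whether an unweighted mean is the right
summary is an assumption, since the transaction types run at different rates.
The function name mean_msec is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n response times given in milliseconds. */
static double
mean_msec(const int *values, size_t n)
{
    long sum = 0;
    size_t i;

    assert(values != NULL && n > 0);
    for (i = 0; i < n; i++)
        sum += values[i];
    return (double) sum / (double) n;
}
```

Fed with the Snappy row (3790, 2828, 2186, 1284, 1120) this gives 2241.6
msec, and with the LZ4(HC) row (2519, 2449, 1158, 2066, 2065) it gives
2051.4 msec.
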
> 0001-CompressBackupBlock_snappy_lz4_pglz.patch
> <http://postgresql.1045698.n5.nabble.com/file/n5805044/0001-CompressBackupBlock_snappy_lz4_pglz.patch>
> 0002-Support_snappy_lz4.patch
> <http://postgresql.1045698.n5.nabble.com/file/n5805044/0002-Support_snappy_lz4.patch>
