Skip site navigation (1) Skip section navigation (2)

Re: Performance Improvement by reducing WAL for Update Operation

From: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "hlinnakangas(at)vmware(dot)com" <hlinnakangas(at)vmware(dot)com>, "noah(at)leadboat(dot)com"<noah(at)leadboat(dot)com>, "pgsql-hackers(at)postgresql(dot)org"<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2013-01-09 08:05:02
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C383BEAAE32@szxeml509-mbx (view raw or flat)
Thread:
Lists: pgsql-hackers
On Friday, January 04, 2013 8:03 PM Simon Riggs wrote:
On 4 January 2013 13:53, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Friday, December 28, 2012 3:52 PM Simon Riggs wrote:
>> On 28 December 2012 08:07, Kyotaro HORIGUCHI
>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>
>> > Hello, I saw this patch and confirmed that
>> >
>> >  - Coding style looks good.
>> >  - Appliable onto HEAD.
>> >  - Some mis-codings are fixed.
>>
>> I've had a quick review of the patch to see how close we're getting.
>> The perf tests look to me like we're getting what we wanted from this
>> and I'm happy with the recovery performance trade-offs. Well done to
>> both author and testers.
>>
Update patch contains handling of below Comments
>
>* There is a fixed 75% heuristic in the patch. Can we document where
>that came from? Can we have a parameter that sets that please? This
>can be used to have further tests to confirm the useful setting of
>this. I expect it to be removed before we release, but it will help
>during beta. 

Added a guc variable wal_update_compression_ratio to set the compression ratio. 
It can be removed during beta. 

>* The compression algorithm depends completely upon new row length
>savings. If the new row is short, it would seem easier to just skip
>the checks and include it anyway. We can say if old and new vary in
>length by > 50% of each other, just include new as-is, since the rows
>very clearly differ in a big way. 

Added a check in heap_delta_encode to identify whether the tuples are differ in length by 50%. 

>* If full page writes is on and the page is very old, we are just
>going to copy the whole block. So why not check for that rather than
>do all these push ups and then just copy the page anyway? 

Added a function which is used to identify whether the page needs a backup block or not. 
based on the result the optimization is applied.


>* I'd like to see a specific test in regression that is designed to
>exercise the code here. That way we will be certain that the code is
>getting regularly tested. 

Added the regression tests which covers all the changes done for the optimization except recovery.

>* The internal docs are completely absent. We need at least a whole
>page of descriptive comment, discussing trade-offs and design
>decisions. This is very important because it will help locate bugs
>much faster if these things are clealry documented. It also helps
>reviewers. This is a big timewaster for committers because you have to
>read the whole patch and understand it before you can attempt to form
>opinions. Commits happen quicker and easier with good comments.
>* Need to mention the WAL format change, or include the change within
>the patch so we can see 

backend/access/transam/README is updated with details. 

>* Lots of typos in comments. Many comments say nothing more than the
>words already used in the function name itself

corrected the typos and removed unnecessary comments. 

>* "flags" variables are almost always int or uint in PG source.
>
>* PGLZ_HISTORY_SIZE needs to be documented in the place it is defined,
>not the place its used. The test if (oldtuplen < PGLZ_HISTORY_SIZE)
>really needs to be a test inside the compression module to maintain
>better modularity, so the value itself needn't be exported

(oldtuplen < PGLZ_HISTORY_SIZE) validation is moved inside the heap_delta_encode 
and updated the flags variable also. 

Test results with modified pgbench (1800 record size) on the latest patch: 

-Patch-             -tps(at)-c1-     -WAL(at)-c1-      -tps(at)-c2-      -WAL(at)-c2-
Head                831           4.17 GB        1416           7.13 GB
WAL modification    846           2.36 GB        1712           3.31 GB

-Patch-             -tps(at)-c4-     -WAL(at)-c4-      -tps(at)-c8-      -WAL(at)-c8-
Head                2196          11.01 GB       2758           13.88 GB
WAL modification    3295           5.87 GB       5472            9.02 GB


With Regards,
Amit Kapila. 

Attachment: wal_update_changes_v7.patch
Description: application/octet-stream (60.4 KB)

In response to

Responses

pgsql-hackers by date

Next:From: Andrew DunstanDate: 2013-01-09 08:18:08
Subject: Re: PL/perl should fail on configure, not make
Previous:From: Benedikt GrundmannDate: 2013-01-09 07:56:12
Subject: Re: Improve compression speeds in pg_lzcompress.c

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group