On Friday, January 04, 2013 8:03 PM Simon Riggs wrote:
On 4 January 2013 13:53, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
> On Friday, December 28, 2012 3:52 PM Simon Riggs wrote:
>> On 28 December 2012 08:07, Kyotaro HORIGUCHI
>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> > Hello, I saw this patch and confirmed that
>> > - Coding style looks good.
>> > - It applies cleanly onto HEAD.
>> > - Some mis-codings are fixed.
>> I've had a quick review of the patch to see how close we're getting.
>> The perf tests look to me like we're getting what we wanted from this
>> and I'm happy with the recovery performance trade-offs. Well done to
>> both author and testers.
The updated patch handles the comments below:
>* There is a fixed 75% heuristic in the patch. Can we document where
>that came from? Can we have a parameter that sets that please? This
>can be used to have further tests to confirm the useful setting of
>this. I expect it to be removed before we release, but it will help
Added a GUC variable, wal_update_compression_ratio, to set the compression ratio.
It can be removed during beta.
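For illustration, the ratio test that such a GUC would gate could look like the sketch below (the function name and the integer-percent representation are assumptions, not the patch's actual code; the default of 75 mirrors the fixed 75% heuristic discussed above):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: accept the delta encoding only when the encoded
 * result is no larger than ratio_percent of the raw tuple data. */
static bool
meets_compression_ratio(size_t rawlen, size_t encodedlen, int ratio_percent)
{
    /* integer arithmetic keeps floating point out of the WAL path */
    return encodedlen * 100 <= rawlen * (size_t) ratio_percent;
}
```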
>* The compression algorithm depends completely upon new row length
>savings. If the new row is short, it would seem easier to just skip
>the checks and include it anyway. We can say if old and new vary in
>length by > 50% of each other, just include new as-is, since the rows
>very clearly differ in a big way.
Added a check in heap_delta_encode to identify whether the tuples differ in length by more than 50%.
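A minimal sketch of that length check (the helper name is hypothetical; in the patch the test lives inside heap_delta_encode):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: report whether old and new tuple lengths differ by
 * more than 50% of the larger one, in which case delta encoding is skipped
 * and the new tuple is WAL-logged as-is. */
static bool
tuple_lengths_differ_too_much(size_t oldlen, size_t newlen)
{
    size_t larger  = (oldlen > newlen) ? oldlen : newlen;
    size_t smaller = (oldlen > newlen) ? newlen : oldlen;

    return (larger - smaller) > larger / 2;
}
```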
>* If full page writes is on and the page is very old, we are just
>going to copy the whole block. So why not check for that rather than
>do all these push ups and then just copy the page anyway?
Added a function that identifies whether the page needs a backup block or not;
the optimization is applied based on its result.
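The shape of such a check might look like this sketch (a simplification of PostgreSQL's LSN-based full-page-write logic; the function name and flat arguments are assumptions):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: a page whose LSN is at or before the latest
 * checkpoint's redo pointer will get a full-page image in WAL anyway
 * when full_page_writes is on, so delta encoding the tuple would be
 * wasted effort for that page. */
static bool
page_needs_backup_block(uint64_t page_lsn, uint64_t redo_lsn,
                        bool full_page_writes)
{
    return full_page_writes && page_lsn <= redo_lsn;
}
```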
>* I'd like to see a specific test in regression that is designed to
>exercise the code here. That way we will be certain that the code is
>getting regularly tested.
Added regression tests that cover all the changes done for the optimization, except recovery.
>* The internal docs are completely absent. We need at least a whole
>page of descriptive comment, discussing trade-offs and design
>decisions. This is very important because it will help locate bugs
>much faster if these things are clearly documented. It also helps
>reviewers. This is a big timewaster for committers because you have to
>read the whole patch and understand it before you can attempt to form
>opinions. Commits happen quicker and easier with good comments.
>* Need to mention the WAL format change, or include the change within
>the patch so we can see
backend/access/transam/README is updated with details.
>* Lots of typos in comments. Many comments say nothing more than the
>words already used in the function name itself
Corrected the typos and removed the unnecessary comments.
>* "flags" variables are almost always int or uint in PG source.
>* PGLZ_HISTORY_SIZE needs to be documented in the place it is defined,
>not the place its used. The test if (oldtuplen < PGLZ_HISTORY_SIZE)
>really needs to be a test inside the compression module to maintain
>better modularity, so the value itself needn't be exported
Moved the (oldtuplen < PGLZ_HISTORY_SIZE) validation inside heap_delta_encode,
and also updated the flags variable type.
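Kept inside the compression module, the guard might read like this sketch (the window value and helper name shown are assumptions for illustration, not the exported constant itself):

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for pglz's history window size; the real constant
 * is defined, and now documented, in the compression module itself. */
#define PGLZ_HISTORY_SIZE 4096

/* Hypothetical guard: pglz can only emit back-references within its
 * history window, so an old tuple at or beyond that size cannot serve
 * as the delta-encoding source. */
static bool
delta_encoding_source_fits(size_t oldtuplen)
{
    return oldtuplen < PGLZ_HISTORY_SIZE;
}
```

Keeping this test behind the module boundary means PGLZ_HISTORY_SIZE no longer needs to be exported to the heap code, which is the modularity point raised above.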
Test results with a modified pgbench (record size 1800) on the latest patch:
-Patch-             -tps@-c1-  -WAL@-c1-  -tps@-c2-  -WAL@-c2-
Head                 831       4.17 GB    1416       7.13 GB
WAL modification     846       2.36 GB    1712       3.31 GB

-Patch-             -tps@-c4-  -WAL@-c4-  -tps@-c8-  -WAL@-c8-
Head                2196       11.01 GB   2758       13.88 GB
WAL modification    3295       5.87 GB    5472       9.02 GB