Re: Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>, "'Kyotaro HORIGUCHI'" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: <hlinnakangas(at)vmware(dot)com>, <noah(at)leadboat(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2012-12-28 11:27:23
Message-ID: 00a201cde4ee$4f759df0$ee60d9d0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, December 28, 2012 3:52 PM Simon Riggs wrote:
> On 28 December 2012 08:07, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> > Hello, I saw this patch and confirmed that
> >
> > - Coding style looks good.
> > - Appliable onto HEAD.
> > - Some mis-codings are fixed.
>
> I've had a quick review of the patch to see how close we're getting.
> The perf tests look to me like we're getting what we wanted from this
> and I'm happy with the recovery performance trade-offs. Well done to
> both author and testers.
>
> My comments
>
> * There is a fixed 75% heuristic in the patch. Can we document where
> that came from?

It is from LZ compression strategy. Refer PGLZ_Strategy.
I will add comment for it.

> Can we have a parameter that sets that please? This
> can be used to have further tests to confirm the useful setting of
> this. I expect it to be removed before we release, but it will help
> during beta.

I shall add that for test purpose.

> * The compression algorithm depends completely upon new row length
> savings. If the new row is short, it would seem easier to just skip
> the checks and include it anyway. We can say if old and new vary in
> length by > 50% of each other, just include new as-is, since the rows
> very clearly differ in a big way.

I think it makes more sense. So I shall update the patch.

> Also, if tuple is same length as
> before, can we compare the whole tuple at once to save doing
> per-column checks?

I shall evaluate and discuss with you.

> * If full page writes is on and the page is very old, we are just
> going to copy the whole block. So why not check for that rather than
> do all these push ups and then just copy the page anyway?

I shall check once and update the patch.

> * TOAST is not handled at all. No comments about it, nothing. Does
> that mean it hasn't been considered? Or did we decide not to care in
> this release?

> Presumably that means we are comparing toast pointers
> byte by byte to see if they are the same?

Yes, currently this patch is doing byte by byte comparison for toast
pointers. I shall add comment.
In future, we can evaluate if further optimizations can be done.

> * I'd like to see a specific test in regression that is designed to
> exercise the code here. That way we will be certain that the code is
> getting regularly tested.

I shall add more specific tests.


> * The internal docs are completely absent. We need at least a whole
> page of descriptive comment, discussing trade-offs and design
> decisions. This is very important because it will help locate bugs
> much faster if these things are clealry documented. It also helps
> reviewers. This is a big timewaster for committers because you have to
> read the whole patch and understand it before you can attempt to form
> opinions. Commits happen quicker and easier with good comments.

Do you have any suggestion for where to put this information, any particular
ReadMe?

> * Lots of typos in comments. Many comments say nothing more than the
> words already used in the function name itself
>
> * "flags" variables are almost always int or uint in PG source.

> * PGLZ_HISTORY_SIZE needs to be documented in the place it is defined,
> not the place its used. The test if (oldtuplen < PGLZ_HISTORY_SIZE)
> really needs to be a test inside the compression module to maintain
> better modularity, so the value itself needn't be exported

I shall update the patch to address it.


> * Need to mention the WAL format change, or include the change within
> the patch so we can see

Sure, I will update this in code comments and internals docs.

With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2012-12-28 11:28:41 Re: Performance Improvement by reducing WAL for Update Operation
Previous Message Simon Riggs 2012-12-28 10:21:38 Re: Performance Improvement by reducing WAL for Update Operation