Re: Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Kyotaro HORIGUCHI'" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: <hlinnakangas(at)vmware(dot)com>, <noah(at)leadboat(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2012-12-10 12:34:52
Message-ID: 011401cdd6d2$c13ba440$43b2ecc0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Monday, December 10, 2012 2:41 PM Kyotaro HORIGUCHI wrote:
> Thank you.
>
> > >heap_attr_get_length_and_check_equals:
> ..
> > >- This function returns always false for attrnum <= 0 as whole
> > > tuple or some system attrs comparison regardless of the real
> > > result, which is a bit different from the anticipation which
> > > the name gives. If you need to keep this optimization, it
> > > should have the name more specific to the purpose.
> >
> > The heap_attr_get_length_and_check_equals function is similar to
> heap_tuple_attr_equals,
> > the attrnum <= 0 check is required for heap_tuple_attr_equals.
>
> Sorry, you're right.
>
> > >haap_delta_encode:
> > >
> > >- Some misleading variable names (like match_not_found),
> > > some reatitions of similiar codelets (att_align_pointer,
> pglz_out_tag),
> > > misleading slight difference of the meanings of variables of
> > > similar names(old_off and new_off and the similar pairs),
> > > and bit tricky use of pglz_out_add and pglz_out_tag with length =
> 0.
> > >
> > > These are welcome to be modified for better readability.
> >
> > The variable names are modified, please check them once.
> >
> > The (att_align_pointer, pglz_out_tag) repetition code is added to take
> care of padding only incase of values are equal.
> > Use of pglz_out_add and pglz_out_tag with length = 0 is done because
> of code readability.
>
> Oops! Sorry for mistake. My point was that the bases for old_off
> (of match_off) and dp, not new_off. It is no unnatural. Namings
> had not been the problem and the function was perfect as of the
> last patch.

I think new naming I have done are more meaningful, do you think I should
revert to previous patch one's.

> I'd been confised by the asymmetry between match_off
> to pglz_out_tag and dp to pglz_out_add.

If we see the usage of pglz_out_tag and pglz_out_literal in pglz_compress(),
it is same as I have used.


> > Another change is also done to handle the history size of 2 bytes
> which is possible with the usage of LZ macro's for delta encoding.
>
> Good catch. This seems to have been a potential bug which does no
> harm when called from pglz_compress..
>
> ==========
>
> Looking into wal_update_changes_mod_lz_v6.patch, I understand
> that this patch experimentally adds literal data segment which
> have more than single byte in PG-LZ algorithm. According to
> pglz_find_match, memCMP is slower than 'while(*s && *s == *d)' if
> len < 16 and I suppose it is probably true at least for 4 byte
> length data. This is also applied on encoding side. If this mod
> does no harm to performance, I want to see this applied also to
> pglz_comress.

Where in pglz_comress(), do you want to see similar usage?
Or do you want to see such use in function
heap_attr_get_length_and_check_equals(), where it compares 2 attributes.

> By the way, the comment on pg_lzcompress.c:690 seems to quite
> differ from what the code does.

I shall fix this.

With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2012-12-10 12:46:44 Re: Switching timeline over streaming replication
Previous Message Heikki Linnakangas 2012-12-10 11:50:58 Re: [BUG?] lag of minRecoveryPont in archive recovery