Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com>, <noah(at)leadboat(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [WIP] Performance Improvement by reducing WAL for Update Operation
Date: 2012-09-28 13:33:27
Message-ID: 007e01cd9d7d$d7f34690$87d9d3b0$@kapila@huawei.com
Lists: pgsql-hackers

> On Thursday, September 27, 2012 6:39 PM Amit Kapila wrote:
> > On Thursday, September 27, 2012 4:12 PM Heikki Linnakangas wrote:
> > On 25.09.2012 18:27, Amit Kapila wrote:
> > > If you feel it is must to do the comparison, we can do it in same way
> > > as we identify for HOT?
> >
> > Yeah. (But as discussed, I think it would be even better to just treat
> > the old and new tuple as an opaque chunk of bytes, and run them through
> > a generic delta algorithm).
> >
>
> Thank you for the modified patch.
>
> > The conclusion is that there isn't very much difference among the
> > patches. They all squeeze the WAL to about the same size, and the
> > increase in TPS is roughly the same.
> >
> > I think more performance testing is required. The modified pgbench test
> > isn't necessarily very representative of a real-life application. The
> > gain (or loss) of this patch is going to depend a lot on how many
> > columns are updated, and in what ways. Need to test more scenarios,
> > with many different database schemas.

I have run a few such tests and plan to run more.

> Now I shall do the various tests for following and post it here:
> a. Attached Patch in the mode where it takes advantage of history tuple
> b. By changing the logic for modified column calculation to use calculation
> for memcmp()
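
To make option 'b' concrete, here is a minimal sketch of a memcmp()-style
modified-column calculation. It is written in Python only for illustration
and is not the patch code: the function names and the representation of a
tuple as a list of byte strings are assumptions made for this example.

```python
from typing import List, Tuple


def modified_columns(old_cols: List[bytes], new_cols: List[bytes]) -> List[int]:
    """Return the indexes of columns whose raw bytes differ.

    The byte-wise comparison plays the role of memcmp() on each column.
    """
    if len(old_cols) != len(new_cols):
        raise ValueError("old and new tuple must have the same number of columns")
    return [i for i, (old, new) in enumerate(zip(old_cols, new_cols)) if old != new]


def encode_update(old_cols: List[bytes], new_cols: List[bytes]) -> List[Tuple[int, bytes]]:
    """Keep only (column index, new value) pairs for the changed columns."""
    return [(i, new_cols[i]) for i in modified_columns(old_cols, new_cols)]


def apply_update(old_cols: List[bytes], delta: List[Tuple[int, bytes]]) -> List[bytes]:
    """Rebuild the new tuple from the old tuple plus the recorded changes."""
    rebuilt = list(old_cols)
    for index, value in delta:
        rebuilt[index] = value
    return rebuilt
```

The idea is that only the changed columns need to be recorded, since the
unchanged ones can be taken from the old tuple when the record is replayed.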

The attached documents contain data for the following scenarios for both
patch 'a' (LZ compression patch) and patch 'b' (modified WAL patch):

1. Using a fixed string (last few bytes are random) to update the column
   values.
   Total record length = 1800
   Updated columns length = 250
2. Using a random string to update the column values.
   Total record length = 1800
   Updated columns length = 250
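
The two kinds of update value can be sketched as below. This is only an
illustration in Python, not the code from the attached pgbench files. The
4-byte random tail, the filler character and the alphanumeric alphabet are
assumptions, 250 is assumed to be the total updated length, and zlib is used
as a stand-in for the LZ compression in the patch just to show how differently
the two inputs compress.

```python
import random
import string
import zlib

UPDATED_LEN = 250   # "Updated columns length" from the scenarios above
RANDOM_TAIL = 4     # assumption: how many trailing bytes are random

ALPHABET = string.ascii_letters + string.digits


def random_payload(rng: random.Random, length: int = UPDATED_LEN) -> bytes:
    """Scenario 2: every byte of the update value is random."""
    return "".join(rng.choice(ALPHABET) for _ in range(length)).encode("ascii")


def fixed_payload(rng: random.Random, length: int = UPDATED_LEN,
                  tail: int = RANDOM_TAIL) -> bytes:
    """Scenario 1: a constant string whose last few bytes are random."""
    return b"x" * (length - tail) + random_payload(rng, tail)


def compressed_size(data: bytes) -> int:
    """Size after zlib compression (stand-in only, not PostgreSQL's code)."""
    return len(zlib.compress(data))


if __name__ == "__main__":
    rng = random.Random(0)
    print("fixed :", compressed_size(fixed_payload(rng)))
    print("random:", compressed_size(random_payload(rng)))
```

The mostly fixed value compresses to a small fraction of its length, while
the random value stays close to its original size.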

Observations -
1. With both patches the performance increase is very good.
2. Both patches give almost the same performance increase, with slightly
   more for the LZ compression patch.
3. TPS varies with the LZ patch, but on average it is equivalent to the
   other patch.

Other performance tests I am planning to conduct:
1. Using a bigger random string to update the column values.
   Total record length = 1800
   Updated columns length = 250
2. Using a fixed string (last few bytes are random) to update the column
   values.
   Total record length = 1800
   Updated columns length = 50, 100, 500, 750, 1000, 1500, 1800
3. Recovery performance test as suggested by Noah.
4. Complete testing of the LZ compression patch using the test cases defined
   for the original patch.

Kindly suggest more performance test cases which can make the findings
concrete, or in case you feel the above is sufficient, please confirm.

With Regards,
Amit Kapila.

Attachment Content-Type Size
pgbench_wal_modified_and_lz_fixed_test.htm text/html 27.2 KB
pgbench_fixed.c application/octet-stream 63.4 KB
pgbench_random.c application/octet-stream 63.3 KB
pgbench_wal_modified_and_lz_random_test.htm text/html 27.5 KB
