Re: Performance degradation in commit 6150a1b0

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance degradation in commit 6150a1b0
Date: 2016-03-23 08:29:20
Message-ID: CAE9k0P=Cd1kYUdS_kE5daFu9a9W9txSvGZCDJQMJTADByF=Enw@mail.gmail.com
Lists: pgsql-hackers

Hi All,

I have been working on this issue for the last few days, trying to investigate
the probable reasons for the performance degradation at commit
6150a1b0. After going through Andres' patch for moving the buffer I/O and
content locks out of the main tranche, the following two things came to
mind.

1. The content lock is no longer stored as a pointer in the BufferDesc
structure; instead, it is embedded as an LWLock structure. This increases the
overall structure size from 64 bytes to 80 bytes. To investigate this, I
reverted the content-lock changes from commit 6150a1b0 and took at least 10
readings. With this change, the overall performance is similar to what was
observed earlier, i.e. before commit 6150a1b0.

2. Secondly, the BufferDesc structure is padded to 64 bytes, whereas the PG
cache line alignment (PG_CACHE_LINE_SIZE) is 128 bytes. Also, after changing
the BufferDesc structure padding size to 128 bytes along with the changes
mentioned in point #1 above, the overall performance is again similar to what
was observed before commit 6150a1b0.
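
To make the arithmetic behind the two points above concrete, here is a minimal
C sketch. It does not use the real BufferDesc or LWLock definitions: DemoLock,
DescWithPointer, DescWithEmbedded, DEMO_CACHE_LINE and count_straddling are
hypothetical names invented for illustration, and the field layout is only an
assumption chosen so that the sizes resemble the 64 and 80 bytes mentioned
above on a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-byte cache line, the value cited in the message above. */
#define DEMO_CACHE_LINE 128

/* Hypothetical lock type for illustration only; not PostgreSQL's LWLock. */
typedef struct DemoLock
{
    uint32_t state;
    uint32_t tranche;
    void    *waiters_head;
    void    *waiters_tail;
} DemoLock;

/* Descriptor that only points at a lock stored elsewhere. */
typedef struct DescWithPointer
{
    char      header[56];       /* stand-in for the other fields */
    DemoLock *content_lock;
} DescWithPointer;

/* Descriptor that embeds the lock itself, so it grows by the lock's size. */
typedef struct DescWithEmbedded
{
    char     header[56];        /* stand-in for the other fields */
    DemoLock content_lock;
} DescWithEmbedded;

/*
 * Count how many of the first n elements of an array with the given element
 * stride touch more than one cache line, assuming the array itself starts
 * on a cache line boundary.
 */
size_t
count_straddling(size_t stride, size_t line, size_t n)
{
    size_t count = 0;

    assert(stride > 0 && line > 0);
    for (size_t i = 0; i < n; i++)
    {
        size_t first = (i * stride) / line;               /* line of first byte */
        size_t last = (i * stride + stride - 1) / line;   /* line of last byte */

        if (first != last)
            count++;
    }
    return count;
}
```

On a typical 64-bit build, sizeof(DescWithPointer) and sizeof(DescWithEmbedded)
should come out as 64 and 80. With an 80-byte stride,
count_straddling(80, DEMO_CACHE_LINE, 8) returns 4, i.e. half of the
descriptors span two cache lines, whereas strides of 64 or 128 give 0. Note
that at a 64-byte stride two descriptors still share one 128-byte line, which
is why padding to 128 bytes is a separate experiment.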

Please have a look at the attached test report, which contains the
performance test results for all the scenarios discussed above, and let me
know your thoughts.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

On Sat, Feb 27, 2016 at 9:26 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On February 26, 2016 7:55:18 PM PST, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >On Sat, Feb 27, 2016 at 12:41 AM, Andres Freund <andres(at)anarazel(dot)de>
> >wrote:
> >>
> >> Hi,
> >>
> >> On 2016-02-25 12:56:39 +0530, Amit Kapila wrote:
> >> > From past few weeks, we were facing some performance degradation in the
> >> > read-only performance bench marks in high-end machines. My colleague
> >> > Mithun, has tried by reverting commit ac1d794 which seems to degrade the
> >> > performance in HEAD on high-end m/c's as reported previously[1], but still
> >> > we were getting degradation, then we have done some profiling to see what
> >> > has caused it and we found that it's mainly caused by spin lock when
> >> > called via pin/unpin buffer and then we tried by reverting commit 6150a1b0
> >> > which has recently changed the structures in that area and it turns out
> >> > that reverting that patch, we don't see any degradation in performance.
> >> > The important point to note is that the performance degradation doesn't
> >> > occur every time, but if the tests are repeated twice or thrice, it
> >> > is easily visible.
> >>
> >> > m/c details
> >> > IBM POWER-8
> >> > 24 cores,192 hardware threads
> >> > RAM - 492GB
> >> >
> >> > Non-default postgresql.conf settings-
> >> > shared_buffers=16GB
> >> > max_connections=200
> >> > min_wal_size=15GB
> >> > max_wal_size=20GB
> >> > checkpoint_timeout=900
> >> > maintenance_work_mem=1GB
> >> > checkpoint_completion_target=0.9
> >> >
> >> > scale_factor - 300
> >> >
> >> > Performance at commit 43cd468cf01007f39312af05c4c92ceb6de8afd8 is 469002 at
> >> > 64-client count and then at 6150a1b08a9fe7ead2b25240be46dddeae9d98e1, it
> >> > went down to 200807. This performance numbers are median of 3 15-min
> >> > pgbench read-only tests. The similar data is seen even when we revert the
> >> > patch on latest commit. We have yet to perform detail analysis as to why
> >> > the commit 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 lead to degradation,
> >> > but any ideas are welcome.
> >>
> >> Ugh. Especially the varying performance is odd. Does it vary between
> >> restarts, or is it just happenstance? If it's the former, we might be
> >> dealing with some alignment issues.
> >>
> >
> >It varies between restarts.
> >
> >>
> >> If not, I wonder if the issue is massive buffer header contention. As a
> >> LL/SC architecture acquiring the content lock might interrupt buffer
> >> spinlock acquisition and vice versa.
> >>
> >> Does applying the patch from
> >> http://archives.postgresql.org/message-id/CAPpHfdu77FUi5eiNb%2BjRPFh5S%2B1U%2B8ax4Zw%3DAUYgt%2BCPsKiyWw%40mail.gmail.com
> >> change the picture?
> >>
> >
> >Not tried, but if this is alignment issue as you are suspecting above, then
> >does it make sense to try this out?
>
> It's the other theory I had. And it's additionally useful testing
> regardless of this regression...
>
> ---
> Please excuse brevity and formatting - I am writing this on my mobile
> phone.
>

Attachment Content-Type Size
Performance_Results.xlsx application/vnd.openxmlformats-officedocument.spreadsheetml.sheet 8.0 KB
