Re: Speed up Clog Access by increasing CLOG buffers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-10-20 17:59:13
Message-ID: CA+TgmobJBv0qYEMazPEqsit4zkk_ECvafYdu8X=jAnVei0yaYg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 20, 2016 at 11:45 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Oct 20, 2016 at 3:36 AM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>> On Thu, Oct 13, 2016 at 12:25 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> I agree with these conclusions. I had a chance to talk with Andres
>>> this morning at Postgres Vision and based on that conversation I'd
>>> like to suggest a couple of additional tests:
>>>
>>> 1. Repeat this test on x86. In particular, I think you should test on
>>> the EnterpriseDB server cthulhu, which is an 8-socket x86 server.
>>
>> I have done my test on cthulhu. The basic difference is that on POWER we
>> saw ClogControlLock on top at 96 or more clients with 300 scale
>> factor. But on cthulhu at 300 scale factor the transactionid lock is
>> always on top. So I repeated my test with 1000 scale factor as well on
>> cthulhu.
>
> So the upshot appears to be that this problem is a lot worse on power2
> than cthulhu, which suggests that this is architecture-dependent. I
> guess it could also be kernel-dependent, but it doesn't seem likely,
> because:
>
> power2: Red Hat Enterprise Linux Server release 7.1 (Maipo),
> 3.10.0-229.14.1.ael7b.ppc64le
> cthulhu: CentOS Linux release 7.2.1511 (Core), 3.10.0-229.7.2.el7.x86_64
>
> So here's my theory. The whole reason why Tomas is having difficulty
> seeing any big effect from these patches is because he's testing on
> x86. When Dilip tests on x86, he doesn't see a big effect either,
> regardless of workload. But when Dilip tests on POWER, which I think
> is where he's mostly been testing, he sees a huge effect, because for
> some reason POWER has major problems with this lock that don't exist
> on x86.
>
> If that's so, then we ought to be able to reproduce the big gains on
> hydra, a community POWER server. In fact, I think I'll go run a quick
> test over there right now...

And ... nope. I ran a 30-minute pgbench test on unpatched master
using unlogged tables at scale factor 300 with 64 clients and got
these results:

    14  LWLockTranche | wal_insert
    36  LWLockTranche | lock_manager
    45  LWLockTranche | buffer_content
   223  Lock          | tuple
   527  LWLockNamed   | CLogControlLock
   921  Lock          | extend
  1195  LWLockNamed   | XidGenLock
  1248  LWLockNamed   | ProcArrayLock
  3349  Lock          | transactionid
 85957  Client        | ClientRead
135935                |

I then started a run at 96 clients which I accidentally killed shortly
before it was scheduled to finish, but the results are not much
different; there is no hint of the runaway CLogControlLock contention
that Dilip sees on power2.
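
To put the 64-client numbers in proportion, here is a minimal sketch that parses the sample counts listed above and computes how much of the lock and LWLock waiting is attributable to a given wait event. The helper names (parse_samples, lock_wait_share) are hypothetical, and treating ClientRead and the blank row as "not waiting on a lock" is an assumption made only for this illustration.

```python
# Wait-event sample counts copied from the 64-client run on hydra.
SAMPLES = """\
14 LWLockTranche | wal_insert
36 LWLockTranche | lock_manager
45 LWLockTranche | buffer_content
223 Lock | tuple
527 LWLockNamed | CLogControlLock
921 Lock | extend
1195 LWLockNamed | XidGenLock
1248 LWLockNamed | ProcArrayLock
3349 Lock | transactionid
85957 Client | ClientRead
135935 |
"""


def parse_samples(text):
    """Return a list of (count, wait_event_type, wait_event) tuples."""
    rows = []
    for line in text.strip().splitlines():
        count, _, rest = line.strip().partition(" ")
        wait_type, _, event = rest.partition("|")
        rows.append((int(count), wait_type.strip(), event.strip()))
    return rows


def lock_wait_share(rows, event):
    """Fraction of lock/LWLock wait samples that belong to `event`.

    Assumption: rows with an empty type (no wait recorded) and rows of
    type Client (idle, waiting for the client) are excluded.
    """
    waits = [r for r in rows if r[1] not in ("", "Client")]
    total = sum(r[0] for r in waits)
    hits = sum(r[0] for r in waits if r[2] == event)
    return hits / total


rows = parse_samples(SAMPLES)
total_samples = sum(r[0] for r in rows)

for name in ("CLogControlLock", "transactionid"):
    in_waits = lock_wait_share(rows, name)
    overall = sum(r[0] for r in rows if r[2] == name) / total_samples
    print(f"{name}: {in_waits:.1%} of lock waits, {overall:.2%} of all samples")
```

Under those assumptions CLogControlLock accounts for roughly 7% of the lock and LWLock wait samples and about 0.2% of all samples, well behind the transactionid lock.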

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
