Re: Speed up Clog Access by increasing CLOG buffers

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2017-09-01 14:03:19
Message-ID: CAFiTN-sEDY-AmemEdqBmROrqurCPqwAbG9sbkyhP1zB2CmieVA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 30, 2017 at 12:54 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> That would have been better. In any case, will do the tests on some
> higher end machine and will share the results.
>
>> Given that we've changed the approach here somewhat, I think we need
>> to validate that we're still seeing a substantial reduction in
>> CLogControlLock contention on big machines.
>>
>
> Sure will do so. In the meantime, I have rebased the patch.

I have repeated some of the tests we performed earlier.

Machine:
Intel 8-socket machine with 128 cores.

Configuration:

shared_buffers=8GB
checkpoint_timeout=40min
max_wal_size=20GB
max_connections=300
maintenance_work_mem=4GB
synchronous_commit=off
checkpoint_completion_target=0.9

I have taken one reading for each test to measure the wait events.
The observation is the same: at higher client counts there is a
significant reduction in the contention on CLogControlLock.

Benchmark: pgbench simple_update, 30-minute run:

Head:(64 Client): (TPS 60720)
53808 Client | ClientRead
26147 IPC | ProcArrayGroupUpdate
7866 LWLock | CLogControlLock
3705 Activity | LogicalLauncherMain
3699 Activity | AutoVacuumMain
3353 LWLock | ProcArrayLock
3099 LWLock | wal_insert
2825 Activity | BgWriterMain
2688 Lock | extend
1436 Activity | WalWriterMain

Patch:(64 Client): (TPS 67207)
53235 Client | ClientRead
29470 IPC | ProcArrayGroupUpdate
4302 LWLock | wal_insert
3717 Activity | LogicalLauncherMain
3715 Activity | AutoVacuumMain
3463 LWLock | ProcArrayLock
3140 Lock | extend
2934 Activity | BgWriterMain
1434 Activity | WalWriterMain
1198 Activity | CheckpointerMain
1073 LWLock | XidGenLock
869 IPC | ClogGroupUpdate

Head:(72 Client): (TPS 57856)

55820 Client | ClientRead
34318 IPC | ProcArrayGroupUpdate
15392 LWLock | CLogControlLock
3708 Activity | LogicalLauncherMain
3705 Activity | AutoVacuumMain
3436 LWLock | ProcArrayLock

Patch:(72 Client): (TPS 65740)

60356 Client | ClientRead
38545 IPC | ProcArrayGroupUpdate
4573 LWLock | wal_insert
3708 Activity | LogicalLauncherMain
3705 Activity | AutoVacuumMain
3508 LWLock | ProcArrayLock
3492 Lock | extend
2903 Activity | BgWriterMain
1903 LWLock | XidGenLock
1383 Activity | WalWriterMain
1212 Activity | CheckpointerMain
1056 IPC | ClogGroupUpdate

Head:(96 Client): (TPS 52170)

62841 LWLock | CLogControlLock
56150 IPC | ProcArrayGroupUpdate
54761 Client | ClientRead
7037 LWLock | wal_insert
4077 Lock | extend
3727 Activity | LogicalLauncherMain
3727 Activity | AutoVacuumMain
3027 LWLock | ProcArrayLock

Patch:(96 Client): (TPS 67932)

87378 IPC | ProcArrayGroupUpdate
80201 Client | ClientRead
11511 LWLock | wal_insert
4102 Lock | extend
3971 LWLock | ProcArrayLock
3731 Activity | LogicalLauncherMain
3731 Activity | AutoVacuumMain
2948 Activity | BgWriterMain
1763 LWLock | XidGenLock
1736 IPC | ClogGroupUpdate

Head:(128 Client): (TPS 40820)

182569 LWLock | CLogControlLock
61484 IPC | ProcArrayGroupUpdate
37969 Client | ClientRead
5135 LWLock | wal_insert
3699 Activity | LogicalLauncherMain
3699 Activity | AutoVacuumMain

Patch:(128 Client): (TPS 67054)

174583 IPC | ProcArrayGroupUpdate
66084 Client | ClientRead
16738 LWLock | wal_insert
4993 IPC | ClogGroupUpdate
4893 LWLock | ProcArrayLock
4839 Lock | extend
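
For a quick side-by-side view, the TPS figures and the head CLogControlLock
sample counts reported above can be tabulated with a few lines of Python.
This is only a sketch over the numbers in this mail; the variable and
function names are my own. CLogControlLock does not appear in the patched
lists shown above, so only the head counts are included.

```python
# TPS reported above for pgbench simple_update, keyed by client count.
head_tps = {64: 60720, 72: 57856, 96: 52170, 128: 40820}
patch_tps = {64: 67207, 72: 65740, 96: 67932, 128: 67054}

# CLogControlLock wait-event samples reported above on head.
head_clog_waits = {64: 7866, 72: 15392, 96: 62841, 128: 182569}


def tps_gain_percent(head, patch):
    """Return the relative TPS gain of patch over head, in percent."""
    return (patch - head) / head * 100.0


gains = {c: tps_gain_percent(head_tps[c], patch_tps[c]) for c in head_tps}

for clients in sorted(head_tps):
    print(f"{clients:>4} clients: head={head_tps[clients]:>6} "
          f"patch={patch_tps[clients]:>6} gain={gains[clients]:5.1f}% "
          f"head CLogControlLock samples={head_clog_waits[clients]}")
```

On these numbers the gain grows from roughly 11% at 64 clients to roughly
64% at 128 clients, which is also where the head CLogControlLock count is
highest.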

Benchmark: select for update with 3 save points, 10 mins run

Script:
\set aid random (1,30000000)
\set tid random (1,3000)

BEGIN;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid for UPDATE;
SAVEPOINT s1;
SELECT tbalance FROM pgbench_tellers WHERE tid = :tid for UPDATE;
SAVEPOINT s2;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid for UPDATE;
SAVEPOINT s3;
SELECT tbalance FROM pgbench_tellers WHERE tid = :tid for UPDATE;
END;

Head:(64 Client): (TPS 44577.1802)

53808 Client | ClientRead
26147 IPC | ProcArrayGroupUpdate
7866 LWLock | CLogControlLock
3705 Activity | LogicalLauncherMain
3699 Activity | AutoVacuumMain
3353 LWLock | ProcArrayLock
3099 LWLock | wal_insert

Patch:(64 Client): (TPS 46156.245)

53235 Client | ClientRead
29470 IPC | ProcArrayGroupUpdate
4302 LWLock | wal_insert
3717 Activity | LogicalLauncherMain
3715 Activity | AutoVacuumMain
3463 LWLock | ProcArrayLock
3140 Lock | extend
2934 Activity | BgWriterMain
1434 Activity | WalWriterMain
1198 Activity | CheckpointerMain
1073 LWLock | XidGenLock
869 IPC | ClogGroupUpdate
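
To compare two such profiles line by line, something like the following
could be used. It is a minimal sketch that assumes the
"<count> <type> | <event>" layout used in this mail; the helper name
parse_wait_events is hypothetical, and only a subset of the lines above is
pasted in to keep it short.

```python
def parse_wait_events(text):
    """Parse lines of the form '<count> <type> | <event>' into a dict."""
    profile = {}
    for line in text.strip().splitlines():
        left, event = line.split("|")
        count, wait_type = left.split()
        profile[(wait_type, event.strip())] = int(count)
    return profile


head = parse_wait_events("""
53808 Client | ClientRead
26147 IPC | ProcArrayGroupUpdate
7866 LWLock | CLogControlLock
3353 LWLock | ProcArrayLock
3099 LWLock | wal_insert
""")

patch = parse_wait_events("""
53235 Client | ClientRead
29470 IPC | ProcArrayGroupUpdate
4302 LWLock | wal_insert
3463 LWLock | ProcArrayLock
869 IPC | ClogGroupUpdate
""")

# Events are compared only where both profiles list them; an event missing
# from one list may simply have fallen below that list's cutoff.
for key in sorted(head.keys() & patch.keys()):
    print(f"{key[0]:>8} | {key[1]:<22} head={head[key]:>6} "
          f"patch={patch[key]:>6} delta={patch[key] - head[key]:+d}")
```

Since the lists are top-N extracts, an event that is absent from one side
should be read as "not among the listed events", not as zero samples.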

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
