From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-04-02 11:55:50
Message-ID: CAA4eK1Kxcv8aj1GfWWcU2aByiKT4-DBh_STdXoccVBVBqVbL5w@mail.gmail.com

On Thu, Mar 31, 2016 at 3:48 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2016-03-31 15:07:22 +0530, Amit Kapila wrote:
> > On Thu, Mar 31, 2016 at 4:39 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > On 2016-03-28 22:50:49 +0530, Amit Kapila wrote:
> > > > On Fri, Sep 11, 2015 at 8:01 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > >
> > > Amit, could you run benchmarks on your bigger hardware? Both with
> > > USE_CONTENT_LOCK commented out and in?
> > >
> >
> > Yes.
>
> Cool.
>

Here is the performance data (the configuration of the machine used for this
test is given at the end of this mail):

Non-default parameters
------------------------------------
max_connections = 300
shared_buffers=8GB
min_wal_size=10GB
max_wal_size=15GB
checkpoint_timeout =35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

Median of 3 runs of 20-min pgbench tpc-b with --unlogged-tables (values are tps):

Patch / Client Count        2      64     128
HEAD+clog_buf_128        4930   66754   68818
group_clog_v8            5753   69002   78843
content_lock             5668   70134   70501
nocontent_lock           4787   69531   70663

I am not exactly sure why the content lock patch (USE_CONTENT_LOCK defined in
0003-Use-a-much-more-granular-locking-model-for-the-clog-) or the no content
lock patch (USE_CONTENT_LOCK not defined) gives poorer performance at 128
clients; it may be due to some bug in the patch, or due to the reason
mentioned by Robert [1] (usage of two locks instead of one). On running it
many times with the content lock and no content lock patches, they sometimes
give 80 ~ 81K TPS at 128 clients, which is approximately 3% higher than the
group_clog_v8 patch; this indicates that the group clog approach is able to
address most of the remaining contention around CLOGControlLock (after
increasing the clog buffers). There is one small regression with the no
content lock patch at the lower client count (2), which might be due to
run-to-run variation, or might be due to the increased number of instructions
from the atomic ops; that needs to be investigated if we want to pursue the
no content lock approach.
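
To make the group clog idea concrete: each backend queues its pending status
update on a shared lock-free list, and whichever backend finds the list empty
becomes the leader, takes CLOGControlLock once, applies every queued update,
and then wakes the rest. Below is a standalone sketch of that pattern using
C11 atomics and a plain mutex; it is my simplification for illustration, not
the actual patch code (the real patch links PGPROC entries together and
sleeps on semaphores rather than spinning):

    /*
     * Standalone sketch of the group-update pattern (simplified, not the
     * patch code).  Followers push requests onto a lock-free list; the
     * backend that finds the list empty becomes the leader, takes the lock
     * once, and applies every queued update.
     */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    typedef struct Request
    {
        int             xid_status;     /* the status this backend wants set */
        atomic_bool     done;           /* set by the leader once applied */
        struct Request *next;
    } Request;

    static _Atomic(Request *) group_head = NULL;    /* pending request list */
    static pthread_mutex_t clog_lock = PTHREAD_MUTEX_INITIALIZER;
                                        /* stand-in for CLOGControlLock */
    static void
    apply_update(Request *req)
    {
        (void) req->xid_status;         /* stand-in for the clog page write */
    }

    static void
    group_update(Request *req)
    {
        atomic_store(&req->done, false);

        /* push our request; if the list was empty, we become the leader */
        do
            req->next = atomic_load(&group_head);
        while (!atomic_compare_exchange_weak(&group_head, &req->next, req));

        if (req->next != NULL)
        {
            /* follower: the real patch sleeps on a semaphore instead */
            while (!atomic_load(&req->done))
                ;
            return;
        }

        /* leader: take the lock once and drain the whole list */
        pthread_mutex_lock(&clog_lock);
        for (Request *r = atomic_exchange(&group_head, (Request *) NULL);
             r != NULL;)
        {
            Request *next = r->next;    /* save before waking the owner */

            apply_update(r);
            atomic_store(&r->done, true);
            r = next;
        }
        pthread_mutex_unlock(&clog_lock);
    }

    int
    main(void)
    {
        Request req = {.xid_status = 1};

        group_update(&req);             /* sole caller, so it acts as leader */
        printf("applied: %d\n", (int) atomic_load(&req.done));
        return 0;
    }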

Note that I have not posted TPS numbers for HEAD itself, as I have already
shown previously that increasing the clog buffers takes TPS from ~36K to ~68K
at 128 clients.
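
For context, the clog_buf_128 build above just raises the cap on the number
of clog buffers so that they scale with shared_buffers. A sketch of that kind
of change (my reconstruction, not quoted from the patch; Min, Max, and
NBuffers are PostgreSQL's existing macros/variables):

    /*
     * Sketch of scaling clog buffers with shared_buffers instead of the old
     * fixed 32 (a reconstruction, not the patch itself).  NBuffers is
     * shared_buffers in 8KB pages; with shared_buffers = 8GB this clamps to
     * the 128 buffers benchmarked above.
     */
    #define Min(x, y)   ((x) < (y) ? (x) : (y))   /* as in PostgreSQL's c.h */
    #define Max(x, y)   ((x) > (y) ? (x) : (y))

    extern int NBuffers;                /* shared_buffers, in 8KB pages */

    int
    CLOGShmemBuffers(void)
    {
        return Min(128, Max(4, NBuffers / 512));
    }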

M/c details
-----------------
Power m/c config (lscpu)
-------------------------------------
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 8
Core(s) per socket: 1
Socket(s): 24
NUMA node(s): 4
Model: IBM,8286-42A
L1d cache: 64K
L1i cache: 32K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-47
NUMA node1 CPU(s): 48-95
NUMA node2 CPU(s): 96-143
NUMA node3 CPU(s): 144-191

[1] - http://www.postgresql.org/message-id/CA+TgmoYjpNKdHDFUtJLAMna-O5LGuTDnanHFAOT5=hN_VAuW2Q@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
