Re: Reduce ProcArrayLock contention

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reduce ProcArrayLock contention
Date: 2015-08-27 13:47:04
Message-ID: CAA4eK1LH2OgmX4ZaC6Zn8yhm4_evUD8vkSsZxrH0igsAOdWd1Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 25, 2015 at 5:21 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> On Thu, Aug 20, 2015 at 3:49 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > How hard did you try checking whether this causes regressions? This
> > increases the number of atomics in the commit path a fair bit. I doubt
> > it's really bad, but it seems like a good idea to benchmark something
> > like a single full-throttle writer and a large number of readers.
>
> One way to test this is run pgbench read load (with 100 client count) and
> write load (tpc-b - with one client) simultaneously and check the results.
> I have tried this and there is a lot of variation (more than 50%) in tps in
> different runs of write load, so not sure if this is the right way to
> benchmark it.
>
> Another possible way is to hack pgbench code and make one thread run
> write transaction and others run read transactions.

I have hacked pgbench to run a single-writer, multi-reader test; the
results are below:

Machine Configuration
-----------------------------
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB

Non-default parameters
------------------------------------
max_connections = 150
shared_buffers=8GB
min_wal_size=10GB
max_wal_size=15GB
checkpoint_timeout =30min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

Data is for three 15-minute pgbench test runs (1 writer, 127 readers).

Without ProcArrayLock optimization
Commit id: 253de7e1
Client count: 128

Run      tps
Run-1    208011
Run-2    471598
Run-3    218295

With ProcArrayLock optimization
Commit id: 0e141c0f
Client count: 128

Run      tps
Run-1    222839
Run-2    469483
Run-3    215791

The test runs seem to be dominated by I/O from the writer client, which
leads to variation in the performance numbers. In general, I don't see any
noticeable difference in performance with or without the ProcArrayLock
optimization. I have even tried turning off synchronous_commit and
fsync, but the results are quite similar.
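
To put numbers on that variation, here is a small sketch that compares the
reported tps figures. It uses only the values from this mail; the helper
names (pct_change, spread_pct) are made up for illustration.

```python
from statistics import mean

# tps values reported in this mail (128 clients: 1 writer, 127 readers)
without_opt = {"Run-1": 208011, "Run-2": 471598, "Run-3": 218295}  # commit 253de7e1
with_opt = {"Run-1": 222839, "Run-2": 469483, "Run-3": 215791}     # commit 0e141c0f


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def spread_pct(values):
    """(max - min) as a percentage of the mean: a crude run-to-run variation measure."""
    values = list(values)
    return (max(values) - min(values)) / mean(values) * 100.0


if __name__ == "__main__":
    # Per-run difference between the two builds
    for run in without_opt:
        print(f"{run}: {pct_change(without_opt[run], with_opt[run]):+.1f}%")
    # Variation across runs of the same build
    print(f"spread without optimization: {spread_pct(without_opt.values()):.0f}%")
    print(f"spread with optimization:    {spread_pct(with_opt.values()):.0f}%")
```

The difference between the two builds within a run is a few percent, while
the spread across runs of the same build is far larger, which is consistent
with reading the difference as noise.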

pgbench modifications
-----------------------------------
Introduced a new type of test run, selected with the -W option, which means
single writer and multiple readers. For example, if the user has given 128
clients and 128 threads, it will use 1 thread for the Write (Update)
transaction and 127 for the Select Only transaction. This works specifically
for this use case, as I had no intention of making a generic test. Please
note that it will work properly if the number of clients and the number of
threads given by the user are the same. Please find attached the pgbench
patch I used for this test. I used the -W option in the pgbench run, as
shown in the steps below.
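
As a rough illustration of the role split described above (this is not the
patch itself, which is C code inside pgbench), the assignment could be
sketched as follows. Which thread becomes the writer is an assumption here;
the function name and role labels are hypothetical.

```python
def assign_scripts(nclients, nthreads):
    """Return the transaction type each thread runs in single-writer mode.

    Assumptions (not taken from the patch): thread 0 is the writer, and the
    mode is rejected unless there is exactly one client per thread.
    """
    if nclients != nthreads:
        raise ValueError("single-writer mode assumes one client per thread")
    return ["write" if t == 0 else "select-only" for t in range(nthreads)]


if __name__ == "__main__":
    roles = assign_scripts(128, 128)
    print(roles.count("write"), "writer,", roles.count("select-only"), "readers")
```

With 128 clients and 128 threads this gives one writer and 127 readers,
matching the test configuration used here.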

Test steps for each Run
--------------------------------------------------------------------------------------------------------
1. Start Server
2. dropdb postgres
3. createdb postgres
4. pgbench -i -s 300 postgres
5. pgbench -c $threads -j $threads -T 1800 -M prepared -W postgres
6. checkpoint
7. Stop Server
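
The steps above could be scripted roughly as below. This sketch only builds
and prints the command lines (a dry run). The use of pg_ctl for starting and
stopping the server and of psql for the checkpoint are assumptions, since
the steps do not say how those were done, and -W exists only in a pgbench
built with the attached patch.

```python
def build_steps(threads, duration_s=1800, scale=300, db="postgres"):
    """Return the shell commands for one test run, in order."""
    return [
        "pg_ctl start -w",             # 1. start server (assumes PGDATA is set)
        f"dropdb {db}",                # 2. drop the old database
        f"createdb {db}",              # 3. recreate it
        f"pgbench -i -s {scale} {db}", # 4. initialize at the given scale factor
        # 5. the run itself; -W is the patched single-writer option
        f"pgbench -c {threads} -j {threads} -T {duration_s} -M prepared -W {db}",
        f'psql -c "CHECKPOINT" {db}',  # 6. checkpoint (assumed to be issued via psql)
        "pg_ctl stop",                 # 7. stop server
    ]


if __name__ == "__main__":
    for number, command in enumerate(build_steps(threads=128), start=1):
        print(f"{number}. {command}")
```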

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
pgbench_singlewriter_multireader_v1.patch application/octet-stream 4.4 KB
