From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Reduce ProcArrayLock contention
Date: 2015-06-29 15:27:38
Message-ID: CAA4eK1JbX4FzPHigNt0JSaz30a85BPJV+ewhk+wg_o-T6xufEA@mail.gmail.com
Lists: pgsql-hackers

I have been analyzing different ways to reduce the contention
around ProcArrayLock. I have evaluated mainly two ideas: the first
is to partition the ProcArrayLock (the basic idea is to allow multiple
clients (equal to the number of ProcArrayLock partitions) to perform
ProcArrayEndTransaction and then wait for all of them at
GetSnapshotData time); the second is to have a mechanism to
group-clear the XIDs during ProcArrayEndTransaction(). The second
idea clearly stands out in my tests, so I have prepared a patch for it
to discuss further here.

The idea behind the second approach (group XID clearing) is to first
try to acquire ProcArrayLock conditionally in ProcArrayEndTransaction().
If we get the lock, we clear the advertised XID as usual; otherwise we
set a flag (which indicates that the advertised XID needs to be cleared
for this proc) and push the proc onto pendingClearXidList. Except for
one proc, all procs on the list then wait for their XID to be cleared.
The one remaining proc attempts the lock acquisition; after acquiring
the lock, it pops all of the requests off the list using
compare-and-swap, servicing each one (clearing its advertised XID)
before moving to the next proc. After servicing all the requests on
pendingClearXidList, it releases the lock and once again goes through
the saved pendingClearXidList, waking all the processes waiting for
their XID to be cleared. To set the appropriate value for
ShmemVariableCache->latestCompletedXid, each proc needs to advertise
its latestXid in case it is pushed onto pendingClearXidList. The
attached patch implements this idea.
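
To make the flow concrete, here is a simplified sketch of the protocol
in the style of the PostgreSQL sources. This is not the attached patch:
identifiers such as pendingClearXidFirst, pendingClearXidNext,
clearXidPending, backendLatestXid and the helper ClearXidInternal()
are illustrative only, and memory barriers, interrupt handling and
semaphore bookkeeping are omitted for brevity.

static void
GroupClearXid(PGPROC *proc, TransactionId latestXid)
{
    PROC_HDR   *procglobal = ProcGlobal;
    uint32      nextidx;
    uint32      wakeidx;

    /* Advertise latestXid so the leader can update latestCompletedXid. */
    proc->clearXidPending = true;
    proc->backendLatestXid = latestXid;

    /* Push ourselves onto pendingClearXidList using compare-and-swap. */
    while (true)
    {
        nextidx = pg_atomic_read_u32(&procglobal->pendingClearXidFirst);
        pg_atomic_write_u32(&proc->pendingClearXidNext, nextidx);
        if (pg_atomic_compare_exchange_u32(&procglobal->pendingClearXidFirst,
                                           &nextidx,
                                           (uint32) proc->pgprocno))
            break;
    }

    /*
     * If the list was non-empty, another proc is (or will become) the
     * leader; sleep until it has cleared our XID on our behalf.
     */
    if (nextidx != INVALID_PGPROCNO)
    {
        while (proc->clearXidPending)
            PGSemaphoreLock(&proc->sem);
        return;
    }

    /* We are the one allowed proc: take the lock for the whole group. */
    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

    /* Detach the entire pending list, again with compare-and-swap. */
    while (true)
    {
        nextidx = pg_atomic_read_u32(&procglobal->pendingClearXidFirst);
        if (pg_atomic_compare_exchange_u32(&procglobal->pendingClearXidFirst,
                                           &nextidx,
                                           INVALID_PGPROCNO))
            break;
    }
    wakeidx = nextidx;

    /* Service each request: clear its XID, advance latestCompletedXid. */
    while (nextidx != INVALID_PGPROCNO)
    {
        PGPROC     *iter = &ProcGlobal->allProcs[nextidx];

        ClearXidInternal(iter, iter->backendLatestXid);
        nextidx = pg_atomic_read_u32(&iter->pendingClearXidNext);
    }

    LWLockRelease(ProcArrayLock);

    /* Walk the saved list once more and wake everyone we serviced. */
    while (wakeidx != INVALID_PGPROCNO)
    {
        PGPROC     *iter = &ProcGlobal->allProcs[wakeidx];

        /* Read the link before waking; the proc may reuse it at once. */
        wakeidx = pg_atomic_read_u32(&iter->pendingClearXidNext);
        iter->clearXidPending = false;
        if (iter != proc)
            PGSemaphoreUnlock(&iter->sem);
    }
}

One point worth noting in the wake-up loop: the saved link has to be
read before the flag is cleared, because a woken proc can immediately
start another transaction and reuse its next pointer.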

Performance Data
-----------------------------
RAM - 500GB
8 sockets, 64 cores (128 threads total with hyperthreading)

Non-default parameters
------------------------------------
max_connections = 150
shared_buffers = 8GB
min_wal_size = 10GB
max_wal_size = 15GB
checkpoint_timeout = 35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

pgbench setup
------------------------
scale factor - 300
Data is on a magnetic disk and WAL is on an SSD.
pgbench -M prepared tpc-b

Head : commit 51d0fe5d
Patch -1 : group_xid_clearing_at_trans_end_rel_v1

Client Count/TPS      1      8     16     32     64    128
HEAD                814   6092  10899  19926  23636  17812
Patch-1            1086   6483  11093  19908  31220  28237

The graph for the data is attached.

Points about performance data
---------------------------------------------
1. The patch gives a good performance improvement at 64 or more clients
and a moderate improvement at lower client counts. The reason is that
the contention around ProcArrayLock is mainly seen at higher client
counts. I have checked that at higher client counts the patch behaves
as if lockless (which means performance with the patch is equivalent
to just commenting out ProcArrayLock in ProcArrayEndTransaction()).
2. There is some noise in this data (at 1 client, I don't expect
much difference).
3. I have done similar tests on a POWER8 machine and found similar gains.
4. The gains are visible when the data fits in shared_buffers; for other
workloads I/O starts dominating.
5. I have seen that the effect of the patch is much more visible if we
keep autovacuum = off (doing a manual vacuum after each run) and set
wal_writer_delay to a lower value (say 20ms). I have not included that
data here, but if somebody is interested, I can do detailed tests
against HEAD with those settings and share the results.

Here are the steps used to take the data (these are repeated for each reading)
--------------------------------------------------------------------------------------------------------
1. Start Server
2. dropdb postgres
3. createdb postgres
4. pgbench -i -s 300 postgres
5. pgbench -c $threads -j $threads -T 1800 -M prepared postgres
6. checkpoint
7. Stop Server

Thanks to Robert Haas for discussing the idea (offlist) and suggesting
improvements, and to Andres Freund for discussing and sharing his
thoughts about this idea at PGCon.

Suggestions?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
group_xid_clearing_at_trans_end_v1.patch application/octet-stream 8.8 KB
perf_data_procarraylock_group_clear_v1.png image/png 82.3 KB
