Re: Reducing contention for the LockMgrLock

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reducing contention for the LockMgrLock
Date: 2005-12-11 21:26:22
Message-ID: 19202.1134336382@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> So it seems it's time to start thinking about how to reduce contention
> for the LockMgrLock.
> ...
> The best idea I've come up with after a bit of thought is to replace the
> shared lock table with N independent tables representing partitions of the
> lock space.
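
(For concreteness, here is a minimal standalone sketch of that partitioning
idea.  The names, the hash function, and the modulo mapping are illustrative
assumptions, not the committed code: each lock tag's hash code selects one of
N partitions, and each partition gets its own LWLock and its own hash table.)

#include <stdint.h>
#include <stdio.h>

#define NUM_LOCK_PARTITIONS 16          /* assumed to be a power of 2 */

/* Hypothetical stand-in for a lock tag's precomputed hash code. */
typedef uint32_t LockHashCode;

/* Map a hash code to a partition number; with a power-of-2 count this
 * could equally be (hashcode & (NUM_LOCK_PARTITIONS - 1)). */
static int
lock_hash_partition(LockHashCode hashcode)
{
    return (int) (hashcode % NUM_LOCK_PARTITIONS);
}

int
main(void)
{
    /* A few made-up hash codes, just to show the spread across partitions. */
    LockHashCode samples[] = { 0x1a2b3c4d, 0x00000007, 0xdeadbeef, 0x12345678 };

    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        printf("hashcode %08x -> partition %d\n",
               (unsigned) samples[i], lock_hash_partition(samples[i]));
    return 0;
}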

I've committed changes along this line. Testing with pgbench on a dual
HT Xeon, I get numbers like this (for successive -c 10 -t 3000 runs
after an -s 10 initialization):

Previous CVS HEAD:
tps = 1561.983651 (including connections establishing)
tps = 1510.301236 (including connections establishing)
tps = 1496.679616 (including connections establishing)

With 4 partitions:
tps = 1671.311892 (including connections establishing)
tps = 1620.093917 (including connections establishing)
tps = 1598.887515 (including connections establishing)

With 16 partitions:
tps = 1689.662504 (including connections establishing)
tps = 1595.530388 (including connections establishing)
tps = 1609.552501 (including connections establishing)

CPU idle percentage according to "top" is around 5% for the previous
HEAD, and around 2% for either of the partition cases. I didn't see
any dropoff in the context-switch (CS) rate however --- it seemed to be
around 35K in all cases.

The TPS rates for a single client are the same to within measurement
noise, so it seems we're not paying too much for the extra
LWLockAcquire/Release cycles during LockReleaseAll.

As you can see, there's not a lot of difference between the 4- and 16-
partition numbers; this is probably because the OIDs assigned in
pgbench's simplistic schema are such that the load is fairly evenly
distributed across partitions in both cases. We need to test some other
scenarios to see which size we should go with. (If you want to test,
change NUM_LOCK_PARTITIONS in src/include/storage/lock.h, and be sure
to recompile the whole backend because this affects the PGPROC struct.)
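
(The reason the whole backend has to be rebuilt is that the partition count
is baked into per-backend data structures.  Roughly along these lines; the
array field name is modeled on the real one, but the layout shown is only a
sketch.)

#include <stdio.h>

#define NUM_LOCK_PARTITIONS 16          /* the constant to experiment with */

/* Hypothetical stand-in for a list head of locks held by a backend. */
typedef struct HeldLockListSketch
{
    struct HeldLockListSketch *head;
} HeldLockListSketch;

/* Sketch of the relevant part of the per-backend shared struct: one
 * held-lock list per partition, so the struct's size (and hence the
 * shared-memory layout) depends on NUM_LOCK_PARTITIONS. */
typedef struct PGPROCSketch
{
    /* ... other per-backend fields ... */
    HeldLockListSketch myProcLocks[NUM_LOCK_PARTITIONS];
} PGPROCSketch;

int
main(void)
{
    printf("sizeof(PGPROCSketch) = %zu with %d partitions\n",
           sizeof(PGPROCSketch), NUM_LOCK_PARTITIONS);
    return 0;
}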

I spent some time looking at the lock acquire/conflict counts using the
same patch mentioned previously, and got some moderately interesting
numbers. A representative value of the per-process counts for the
single LockMgrLock was

PID 12972 lwlock LockMgrLock: shacq 0 exacq 50204 blk 3243

In the old code, there were 15 predictable LockMgrLock acquisitions per
pgbench transaction (for transaction and relation locks), or 45000 for
the whole run; the majority of the other 5K acquisitions seem to be for
RelationExtension locks, with a few hundred Tuple locks occurring due to
update contention on rows of the "branches" table.
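
(For anyone wanting to reproduce these numbers: the instrumentation boils
down to per-process counters bumped in the LWLock acquire path and printed
at backend exit.  Here is a rough standalone sketch of that bookkeeping; the
names and the fixed-size array are illustrative, not the actual patch.)

#include <stdio.h>

#define SKETCH_NUM_LWLOCKS 64           /* illustrative upper bound */

/* Per-process counters, one slot per LWLock id. */
typedef struct LWLockCountersSketch
{
    long shacq;     /* shared acquisitions */
    long exacq;     /* exclusive acquisitions */
    long blk;       /* times this process had to block waiting */
} LWLockCountersSketch;

static LWLockCountersSketch counts[SKETCH_NUM_LWLOCKS];

/* The acquire path would bump these just before returning. */
static void
count_acquire(int lockid, int exclusive, int had_to_block)
{
    if (exclusive)
        counts[lockid].exacq++;
    else
        counts[lockid].shacq++;
    if (had_to_block)
        counts[lockid].blk++;
}

/* At exit, print one line per LWLock this process ever touched. */
static void
print_counts(int pid)
{
    for (int i = 0; i < SKETCH_NUM_LWLOCKS; i++)
        if (counts[i].shacq || counts[i].exacq || counts[i].blk)
            printf("PID %d lwlock %d: shacq %ld exacq %ld blk %ld\n",
                   pid, i, counts[i].shacq, counts[i].exacq, counts[i].blk);
}

int
main(void)
{
    count_acquire(20, 1, 0);            /* exclusive acquire, no block */
    count_acquire(20, 1, 1);            /* exclusive acquire that blocked */
    print_counts(12972);
    return 0;
}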

With 4 lock partitions, a typical process shows

PID 20471 lwlock 20: shacq 0 exacq 8809 blk 115
PID 20471 lwlock 21: shacq 0 exacq 10933 blk 245
PID 20471 lwlock 22: shacq 0 exacq 20267 blk 503
PID 20471 lwlock 23: shacq 0 exacq 17148 blk 404
TOTAL 57157 1267

and with 16:

PID 13367 lwlock 20: shacq 0 exacq 679 blk 1
PID 13367 lwlock 21: shacq 0 exacq 648 blk 2
PID 13367 lwlock 22: shacq 0 exacq 665 blk 3
PID 13367 lwlock 23: shacq 0 exacq 12611 blk 262
PID 13367 lwlock 24: shacq 0 exacq 773 blk 3
PID 13367 lwlock 25: shacq 0 exacq 6715 blk 80
PID 13367 lwlock 26: shacq 0 exacq 781 blk 1
PID 13367 lwlock 27: shacq 0 exacq 6706 blk 89
PID 13367 lwlock 28: shacq 0 exacq 6507 blk 68
PID 13367 lwlock 29: shacq 0 exacq 731 blk 2
PID 13367 lwlock 30: shacq 0 exacq 9492 blk 170
PID 13367 lwlock 31: shacq 0 exacq 837 blk 3
PID 13367 lwlock 32: shacq 0 exacq 6530 blk 81
PID 13367 lwlock 33: shacq 0 exacq 717 blk 1
PID 13367 lwlock 34: shacq 0 exacq 6564 blk 74
PID 13367 lwlock 35: shacq 0 exacq 831 blk 0
TOTAL 61787 840

The increase in the total number of acquisitions happens because
LockReleaseAll needs to touch several partitions during each transaction
commit. There are seven relations in the test (4 tables, 3 indexes) and
you can clearly see which partitions their locks fell into during the
16-way test. (Transaction and tuple locks will be pretty evenly spread
across all the partitions, because those locktags change constantly.)
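
(The shape of that commit-time sweep is roughly as follows.  This is a
standalone sketch assuming per-partition held-lock bookkeeping and stubbed-out
lock calls; it is not the committed LockReleaseAll.)

#include <stdio.h>

#define NUM_LOCK_PARTITIONS 16

/* Hypothetical per-backend bookkeeping: a simple count of locks held in
 * each partition stands in for the real per-partition lock lists. */
typedef struct BackendLocksSketch
{
    int held_in_partition[NUM_LOCK_PARTITIONS];
} BackendLocksSketch;

/* Stubs standing in for acquiring/releasing a partition's LWLock. */
static void partition_lock_acquire(int p) { printf("acquire partition %d\n", p); }
static void partition_lock_release(int p) { printf("release partition %d\n", p); }

/* Commit-time sweep: visit only the partitions in which this backend
 * actually holds locks, taking each partition's LWLock in turn. */
static void
lock_release_all_sketch(BackendLocksSketch *me)
{
    for (int p = 0; p < NUM_LOCK_PARTITIONS; p++)
    {
        if (me->held_in_partition[p] == 0)
            continue;                   /* never touched this partition */

        partition_lock_acquire(p);
        /* ... walk and release this backend's locks in partition p ... */
        me->held_in_partition[p] = 0;
        partition_lock_release(p);
    }
}

int
main(void)
{
    BackendLocksSketch me = { .held_in_partition = { [3] = 2, [7] = 1 } };

    lock_release_all_sketch(&me);
    return 0;
}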

We are getting a reduction in contention, as shown by the falling number
of lock blockages, but we're paying for it with more lock acquisition
cycles.

Bottom line is that this seems to have been a useful improvement, but
it didn't get us as far as I'd hoped.

Any thoughts on other things to try?

regards, tom lane
