Re: Sun Donated a Sun Fire T2000 to the PostgreSQL

From: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: Sun Donated a Sun Fire T2000 to the PostgreSQL
Date: 2006-07-21 07:56:56
Message-ID: 44C088C8.9050303@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Tom Lane wrote:

>Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp> writes:
>
>
>>>>18% in s_lock is definitely bad :-(. Were you able to determine which
>>>>LWLock(s) are accounting for the contention?
>>>>
>>>>
>
>
>
>>Sorry for the delay. Finally I got the oprofile data. It's
>>huge(34MB). If you are interested, I can put somewhere. Please let me
>>know.
>>
>>
>
>I finally got a chance to look at this, and it seems clear that all the
>traffic is on the BufMappingLock. This is essentially the same problem
>we were discussing with respect to Gavin Hamill's report of poor
>performance on an 8-way IBM PPC64 box (see hackers archives around
>2006-04-21). If your database is fully cached in shared buffers, then
>you can do a whole lot of buffer accesses per unit time, and even though
>all the BufMappingLock acquisitions are in shared-LWLock mode, the
>LWLock's spinlock ends up being heavily contended on an SMP box.
>
>It's likely that CVS HEAD would show somewhat better performance because
>of the btree change to cache local copies of index metapages (which
>eliminates a fair fraction of buffer accesses, at least in Gavin's test
>case). Getting much further than that seems to require partitioning
>the buffer mapping table. The last discussion stalled on my concerns
>about unpredictable shared memory usage, but I have some ideas on that
>which I'll post separately. In the meantime, thanks for sending along
>the oprofile data!
>
> regards, tom lane
>
>
I ran pgbench and fired up a DTrace script using the lwlock probes we've
added, and it looks like BufMappingLock is the most contended lock, but
CheckpointStartLocks are held for longer duration!

Lock Id Mode Count
ControlFileLock Exclusive 1
SubtransControlLock Exclusive 1
BgWriterCommLock Exclusive 6
FreeSpaceLock Exclusive 6
FirstLockMgrLock Exclusive 48
BufFreelistLock Exclusive 74
BufMappingLock Exclusive 74
CLogControlLock Exclusive 184
XidGenLock Exclusive 184
CheckpointStartLock Shared 185
WALWriteLock Exclusive 185
ProcArrayLock Exclusive 368
CLogControlLock Shared 552
SubtransControlLock Shared 1273
WALInsertLock Exclusive 1476
XidGenLock Shared 1842
ProcArrayLock Shared 3160
SInvalLock Shared 3684
BufMappingLock Shared 14578

Lock Id Combined Time (ns)
ControlFileLock 7915
BgWriterCommLock 43438
FreeSpaceLock 111139
BufFreelistLock 448530
FirstLockMgrLock 2879957
CLogControlLock 4237750
SubtransControlLock 6378042
XidGenLock 9500422
WALInsertLock 16372040
SInvalLock 23284554
ProcArrayLock 32188638
BufMappingLock 113128512
WALWriteLock 142391501
CheckpointStartLock 4171106665

Regards,
-Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Enver ALTIN 2006-07-21 08:11:17 Re: Transaction Speed
Previous Message Robert Lor 2006-07-21 07:40:53 Generic Monitoring Framework with DTrace patch

Browse pgsql-performance by date

  From Date Subject
Next Message robin.c.smith 2006-07-21 10:40:33 Forcing using index instead of sequential scan?
Previous Message Alaa El Gohary 2006-07-21 07:41:02 BUG #2543: Performance delay acrros the same day