From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: reducing the overhead of frequent table locks - now, with WIP patch
Date: 2011-06-05 21:46:32
Message-ID: BANLkTimFkPJB_mL=b2noGXg55f1v7FObDw@mail.gmail.com
Lists: pgsql-hackers
On Sun, Jun 5, 2011 at 4:01 PM, Stefan Kaltenbrunner
<stefan(at)kaltenbrunner(dot)cc> wrote:
> On 06/05/2011 09:12 PM, Heikki Linnakangas wrote:
>> On 05.06.2011 22:04, Stefan Kaltenbrunner wrote:
>>> and one for the -j80 case(also patched).
>>>
>>>
>>> 485798 48.9667 postgres s_lock
>>> 60327 6.0808 postgres LWLockAcquire
>>> 57049 5.7503 postgres LWLockRelease
>>> 18357 1.8503 postgres hash_search_with_hash_value
>>> 17033 1.7169 postgres GetSnapshotData
>>> 14763 1.4881 postgres base_yyparse
>>> 14460 1.4575 postgres SearchCatCache
>>> 13975 1.4086 postgres AllocSetAlloc
>>> 6416 0.6467 postgres PinBuffer
>>> 5024 0.5064 postgres SIGetDataEntries
>>> 4704 0.4741 postgres core_yylex
>>> 4625 0.4662 postgres _bt_compare
>>
>> Hmm, does that mean that it's spending 50% of the time spinning on a
>> spinlock? That's bad. It's one thing to be contended on a lock, and have
>> a lot of idle time because of that, but it's even worse to spend a lot
>> of time spinning because that CPU time won't be spent on doing more
>> useful work, even if there is some other process on the system that
>> could make use of that CPU time.
>
> well yeah - we are broken right now with only being able to use ~20% of
> CPU on a modern mid-range box, but using 80% CPU (or 4x like in the
> above case) and only getting less than 2x the performance seems wrong as
> well. I also wonder if we are still missing something fundamental -
> because even with the current patch we are quite far away from linear
> scaling and light-years from some of our competitors...
Could you compile with LWLOCK_STATS, rerun these tests, total up the
"blk" numbers by LWLockId, and post the results? (Actually, totalling
up the shacq and exacq numbers would be useful as well, if you
wouldn't mind.)
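(The totalling asked for above is easy to script. A rough sketch in Python, assuming the LWLOCK_STATS lines each backend prints at exit look like `PID 12345 lwlock 7: shacq 10 exacq 3 blk 1` -- the exact field layout may differ between versions, so adjust the regex to match your build's output:)

```python
import re
from collections import defaultdict

# Assumed LWLOCK_STATS line format, one line per lock per backend:
#   PID 12345 lwlock 7: shacq 10 exacq 3 blk 1
LINE_RE = re.compile(r"PID \d+ lwlock (\d+): shacq (\d+) exacq (\d+) blk (\d+)")

def total_by_lock(lines):
    """Sum the shacq/exacq/blk counters per LWLockId across all backends."""
    totals = defaultdict(lambda: [0, 0, 0])  # lock id -> [shacq, exacq, blk]
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            lock_id = int(m.group(1))
            for i in range(3):
                totals[lock_id][i] += int(m.group(i + 2))
    return totals

# Example with two backends reporting on the same lock; in practice you
# would feed this the stderr log captured from the pgbench run.
sample = [
    "PID 101 lwlock 7: shacq 10 exacq 3 blk 1",
    "PID 102 lwlock 7: shacq 5 exacq 2 blk 4",
]
for lock_id, (shacq, exacq, blk) in sorted(
    total_by_lock(sample).items(), key=lambda kv: -kv[1][2]
):
    print(f"lwlock {lock_id}: shacq {shacq} exacq {exacq} blk {blk}")
    # -> lwlock 7: shacq 15 exacq 5 blk 5
```

Sorting by blk descending puts the most-contended locks first, which is the number of interest here.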
Unless I very much miss my guess, we're going to see zero contention
on the new structures introduced by this patch. Rather, I suspect
what we're going to find is that, with the hideous contention on one
particular lock manager partition lock removed, there's a more
spread-out contention problem, likely involving the lock manager
partition lock, the buffer mapping locks, and possibly other LWLocks
as well. The fact that the system is busy-waiting rather than just
not using the CPU at all probably means that the remaining contention
is more spread out than that which is removed by this patch. We don't
actually have everything pile up on a single LWLock (as happens in git
master), but we do spend a lot of time fighting cache lines away from
other CPUs. Or at any rate, that's my guess: we need some real
numbers to know for sure.
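(As a side note, the cost difference Heikki describes -- spinning burns CPU that blocking would leave free for other work -- is easy to see with a toy sketch. This is pure illustration, not PostgreSQL's actual s_lock implementation; the names and the 0.2 s "hold time" are made up:)

```python
import threading
import time

def wait_spinning(flag, deadline):
    """Spin until the flag is set: burns CPU the whole time, like a
    TAS retry loop in s_lock."""
    while not flag.is_set() and time.monotonic() < deadline:
        pass  # tight busy-wait loop

def wait_blocking(flag, deadline):
    """Sleep in the kernel until woken: consumes (almost) no CPU."""
    flag.wait(timeout=max(0.0, deadline - time.monotonic()))

def cpu_cost(waiter, hold_s=0.2):
    """CPU seconds this process consumes while one waiter waits out a
    pretend lock holder for hold_s seconds."""
    flag = threading.Event()
    start_cpu = time.process_time()
    t = threading.Thread(target=waiter, args=(flag, time.monotonic() + 2.0))
    t.start()
    time.sleep(hold_s)  # pretend the lock is held for hold_s seconds
    flag.set()          # "release" the lock
    t.join()
    return time.process_time() - start_cpu

spin_cpu = cpu_cost(wait_spinning)
block_cpu = cpu_cost(wait_blocking)
print(f"spin: {spin_cpu:.3f}s CPU, block: {block_cpu:.3f}s CPU")
```

The spinning waiter charges roughly the whole hold time to the process as CPU, while the blocking waiter charges almost nothing -- which is why 50% of profile samples landing in s_lock is worse than the same wall-clock time spent blocked.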
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company