On Mon, Jun 18, 2012 at 9:36 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sat, Jun 16, 2012 at 6:25 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>> Well, this fell through the cracks, because I forgot to add it to the
>>> January CommitFest. Here it is again, rebased.
>> This applies and builds cleanly and passes make check (under enable-cassert).
>> Not test or docs are needed for a patch of this nature.
>> It does what it says, and we want it.
>> I wondered if the change in the return signature of s_lock would have
>> an affect on performance. So I've run a series of pgbench -T 30 -P
>> -c8 -j8, at a scale of 30 which fits in shared_buffers, using an
>> Amazon c1.xlarge
>> (8 cores). I ran both HEAD, and HEAD+patch (without LWLOCK_STATS in
>> both cases), in random ordering. The patch was 0.37% slower, average
>> 298483 selects per second patched to 299582 HEAD. The difference is
>> probably real (p value 0.042, one sided.) but is also pretty much
>> negligible and could just be due to where the executable code falls in
>> the cache lines which could move around with other changes to the
> I found this a bit surprising, so I retested on the IBM POWER7 box at
> concurrencies from 1 to 64 and found some test runs faster and some
> test runs slower - maybe a few more faster than slower. I could do a
> more exact analysis, but I'm inclined to think that if there is an
> effect here, it's probably just in the noise, and that the real effect
> you measured was, as you say, the result of cache-line shifting or
> some other uninteresting artifact that could just as easily move back
> the other way on some other commit. Really, s_lock should not be
> getting called often enough to matter much, and ignoring the return
> value of that function shouldn't cost anyone much.
>> Two suggestions:
>> In your original email you say "number of pg_usleep() calls that are
>> required to acquire each LWLock", but nothing in the code says this.
>> Just reading lwlock.c I would naively assume it is reporting the
>> number of TAS spins, not the number of spin-delays (and in fact that
>> is what I did assume until I read your email more carefully). A
>> comment somewhere in lwlock.c would be helpful.
> Yeah, or maybe I should just change the printout to say spindelay
> instead of spin, and rename the variables likewise.
>> Also in lwlock.c,
>> if (sh_acquire_counts[i] || ex_acquire_counts[i] ||
>> block_counts[i] || spin_counts[i])
>> I don't think we can have spins (or blocks, for that matter) unless we
>> have some acquires to have caused them, so the last two tests in that
>> line seem to be noise.
> Perhaps so, but I think it's probably safe and clear to just follow
> the existing code pattern.
>> Since my suggestions are minor, should I go ahead and mark this ready
>> for committer?
> If you think it should be committed, yes.
So Jeff did that, and I've now committed the patch.
Thanks for the review.
The Enterprise PostgreSQL Company
In response to
pgsql-hackers by date
|Next:||From: Robert Haas||Date: 2012-06-26 20:08:48|
|Subject: Re: [PATCH] Lazy hashaggregate when no aggregation is needed|
|Previous:||From: Josh Berkus||Date: 2012-06-26 19:49:59|
|Subject: Posix Shared Mem patch|