Re: Patch: fix lock contention for HASHHDR.mutex

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: Patch: fix lock contention for HASHHDR.mutex
Date: 2016-03-18 20:11:00
Message-ID: CA+TgmoZkg-04rcNRURt=jAG0Cs5oPyB-qKxH4wqX09e-oXy-nw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 1, 2016 at 9:43 AM, Aleksander Alekseev
<a(dot)alekseev(at)postgrespro(dot)ru> wrote:
>> I am not sure if this is exactly what has been suggested by Robert,
>> so it is not straightforward to see if his suggestion can allow us to
>> use NUM_FREELISTS as 8 rather than 32. I think instead of trying to
>> use FREELISTBUFF, why not do it as Robert has suggested and try with
>> different values of NUM_FREELISTS?
>
> Well, what I did is in fact a somewhat more general solution than the
> one Robert suggested. At first I just joined the freeList and mutex
> arrays into one array and tried different NUM_FREELISTS values (see the
> FREELISTS column). That was Robert's idea. Unfortunately it didn't give
> any considerable speedup compared with the previous approach.
>
> Then I tried to manually control the size of every array item (see the
> different SIZE values). By making it larger we can put every array item
> into a separate cache line. This approach helped a bit in terms of TPS
> (before: +33.9%, after: +34.2%) but only if NUM_FREELISTS remains the
> same (32).
>
> So answering your question - it turned out that we _can't_ reduce
> NUM_FREELISTS this way.

That's perplexing. I would have expected that with all of the mutexes
packed in back-to-back like this, we would end up with a considerable
amount of false sharing. I don't see why it would ever help to have an
array of bytes, each one a heavily-trafficked mutex, packed into the
same cache line. Anybody else have a theory?
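
For reference, here is a minimal, standalone sketch of the padding idea
being tested above. This is plain C with pthreads, not the actual
dynahash code; CACHE_LINE_SIZE, FreeListData, and PaddedFreeList are
illustrative names, and the assumed line size is 64 bytes:

#include <pthread.h>

#define CACHE_LINE_SIZE 64      /* assumed; typical on x86 */
#define NUM_FREELISTS   32

/* One freelist partition: its lock, entry count, and free-element chain. */
typedef struct
{
    pthread_mutex_t mutex;
    long            nentries;
    void           *freeList;
} FreeListData;

/*
 * Pad each partition out to at least a full cache line, so that traffic on
 * one partition's mutex does not cause false sharing with its neighbours.
 * (If FreeListData is larger than the line size, the union simply grows.)
 */
typedef union
{
    FreeListData data;
    char         pad[CACHE_LINE_SIZE];
} PaddedFreeList;

static PaddedFreeList freelists[NUM_FREELISTS];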

One other thing that concerns me mildly is the way we're handling
nentries. It should be true, with this patch, that the individual
nentries sum to the right value modulo 2^32. But I don't think
there's any guarantee that the individual values stay non-negative any
more, and in theory, after running long enough, one of them could
manage to overflow or underflow. So at a very minimum I think we need
to remove the Assert() that the value is not negative. But really, I
wonder if we shouldn't rework things a little more than that.

One idea is to jigger things so that we maintain a count of the total
number of entries that doesn't change except when we allocate, and
then for each freelist partition we maintain the number of entries in
that freelist partition. So then the size of the hash table, instead of
being sum(nentries), is totalsize - sum(nfree).
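
Here is a rough sketch of that accounting, with made-up names
(total_allocated, nfree, hash_num_entries_sketch) rather than the real
dynahash fields, and with all locking omitted. total_allocated would
only ever grow, under whatever lock element_alloc() already holds, and
each nfree would be protected by its partition's lock, so no
per-partition counter could go negative:

#define NUM_FREELISTS 32

typedef struct
{
    /* per-partition lock omitted for brevity */
    long   nfree;       /* free elements currently on this partition's list */
    void  *freeList;    /* head of the free-element chain */
} FreeListPartition;

typedef struct
{
    long              total_allocated;  /* bumped only when new elements
                                         * are carved out of fresh memory */
    FreeListPartition freelists[NUM_FREELISTS];
} HashHeaderSketch;

/* Live entries = everything ever allocated minus whatever is still free. */
static long
hash_num_entries_sketch(HashHeaderSketch *hdr)
{
    long    nfree = 0;
    int     i;

    for (i = 0; i < NUM_FREELISTS; i++)
        nfree += hdr->freelists[i].nfree;

    return hdr->total_allocated - nfree;
}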

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[ P.S. Sorry it's taken me a while to circle back around to this. ]
