Re: Patch: fix lock contention for HASHHDR.mutex

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: Patch: fix lock contention for HASHHDR.mutex
Date: 2016-03-19 04:28:43
Message-ID: CAA4eK1L-YXwDproVh8jv5z0Q8eJeuVST-zNC2XuJ=ctzSMOvsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 19, 2016 at 1:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Mar 1, 2016 at 9:43 AM, Aleksander Alekseev
> <a(dot)alekseev(at)postgrespro(dot)ru> wrote:
> >
> > So answering your question - it turned out that we _can't_ reduce
> > NUM_FREELISTS this way.
>
> That's perplexing. I would have expected that with all of the mutexes
> packed in back-to-back like this, we would end up with a considerable
> amount of false sharing. I don't know why it ever helps to have an
> array of bytes all in the same cache line of which each one is a
> heavily-trafficked mutex. Anybody else have a theory?
>
> One other thing that concerns me mildly is the way we're handling
> nentries. It should be true, with this patch, that the individual
> nentries sum to the right value modulo 2^32. But I don't think
> there's any guarantee that the values are positive any more, and in
> theory after running long enough one of them could manage to overflow
> or underflow.
>

Won't in theory, without patch as well nentries can overflow after running
for very long time? I think with patch it is more prone to overflow
because we start borrowing from other free lists as well.

So at a very minimum I think we need to remove the
> Assert() the value is not negative. But really, I wonder if we
> shouldn't rework things a little more than that.
>
> One idea is to jigger things so that we maintain a count of the total
> number of entries that doesn't change except when we allocate, and
> then for each freelist partition we maintain the number of entries in
> that freelist partition. So then the size of the hash table, instead
> of being sum(nentries) is totalsize - sum(nfree).
>

To me, your idea sounds much better than current code in terms of
understanding the free list concept as well. So, +1 for changing the code
in this way.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2016-03-19 04:59:01 logger process infinite loop
Previous Message David Rowley 2016-03-19 02:32:09 Re: Parallel Aggregate