Re: BufferAlloc: don't take two simultaneous locks

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, michail(dot)nikolaev(at)gmail(dot)com, x4mmm(at)yandex-team(dot)ru, andres(at)anarazel(dot)de, simon(dot)riggs(at)enterprisedb(dot)com
Subject: Re: BufferAlloc: don't take two simultaneous locks
Date: 2022-03-14 06:15:11
Message-ID: 75fe5128f2550bf0d198ed222f1abd6f921aec77.camel@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

В Пн, 14/03/2022 в 14:31 +0900, Kyotaro Horiguchi пишет:
> At Mon, 14 Mar 2022 09:39:48 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > I'll examine the possibility to resolve this...
>
> The existence of nfree and nalloc made me confused and I found the
> reason.
>
> In the case where a parittion collects many REUSE-ASSIGN-REMOVEed
> elemetns from other paritiotns, nfree gets larger than nalloced. This
> is a strange point of the two counters. nalloced is only referred to
> as (sum(nalloced[])). So we don't need nalloced per-partition basis
> and the formula to calculate the number of used elements would be as
> follows.
>
> sum(nalloced - nfree)
> = <total_nalloced> - sum(nfree)
>
> We rarely create fresh elements in shared hashes so I don't think
> there's additional contention on the <total_nalloced> even if it were
> a global atomic.
>
> So, the remaining issue is the possible imbalancement among
> partitions. On second thought, by the current way, if there's a bad
> deviation in partition-usage, a heavily hit partition finally collects
> elements via get_hash_entry(). By the patch's way, similar thing
> happens via the REUSE-ASSIGN-REMOVE sequence. But buffers once used
> for something won't be freed until buffer invalidation. But bulk
> buffer invalidation won't deviatedly distribute freed buffers among
> partitions. So I conclude for now that is a non-issue.
>
> So my opinion on the counters is:
>
> I'd like to ask you to remove nalloced from partitions then add a
> global atomic for the same use?

I really believe it should be global. I made it per-partition to
not overcomplicate first versions. Glad you tell it.

I thought to protect it with freeList[0].mutex, but probably atomic
is better idea here. But which atomic to chose: uint64 or uint32?
Based on sizeof(long)?
Ok, I'll do in next version.

Whole get_hash_entry look strange.
Doesn't it better to cycle through partitions and only then go to
get_hash_entry?
May be there should be bitmap for non-empty free lists? 32bit for
32 partitions. But wouldn't bitmap became contention point itself?

> No need to do something for the possible deviation issue.

-------

regards
Yura Sokolov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-03-14 06:30:26 Re: [PATCH] Accept IP addresses in server certificate SANs
Previous Message Kyotaro Horiguchi 2022-03-14 05:31:12 Re: BufferAlloc: don't take two simultaneous locks