From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: y(dot)sokolov(at)postgrespro(dot)ru
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, michail(dot)nikolaev(at)gmail(dot)com, x4mmm(at)yandex-team(dot)ru, andres(at)anarazel(dot)de, simon(dot)riggs(at)enterprisedb(dot)com
Subject: Re: BufferAlloc: don't take two simultaneous locks
Date: 2022-03-15 07:25:45
Message-ID: 20220315.162545.2224728499612456194.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Thanks for the new version.

At Tue, 15 Mar 2022 08:07:39 +0300, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote in
> On Mon, 14/03/2022 at 14:57 +0300, Yura Sokolov wrote:
> > On Mon, 14/03/2022 at 17:12 +0900, Kyotaro Horiguchi wrote:
> > > At Mon, 14 Mar 2022 09:15:11 +0300, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote in
> > > > On Mon, 14/03/2022 at 14:31 +0900, Kyotaro Horiguchi wrote:
> > > I tried pgbench runs with scale 100 (with 10 threads, 10 clients) on
> > > 128kB shared buffers and I saw that get_hash_entry never takes the
> > > !element_alloc() path and always allocates a fresh entry, then
> > > saturates at 30 newly allocated elements around the middle of a
> > > 100-second run.
> > >
> > > Then, I tried the same with the patch, and I was surprised to see
> > > that the rise in the number of newly allocated elements didn't stop
> > > and went up to 511 elements after the 100-second run. So I found
> > > that my concern was valid. The change in dynahash actually
> > > continuously/repeatedly causes a lack of freelist entries. I'm not
> > > sure how much impact it would have on performance if we changed
> > > get_hash_entry to prefer other freelists, though.
> >
> > Well, it is quite strange that SharedBufHash is not allocated with
> > HASH_FIXED_SIZE. Could you check what happens with this flag set?
> > I'll try as well.
> >
> > Another way to reduce the observed case is to remember freelist_idx
> > for the reused entry. I didn't believe it mattered much, since
> > entries migrate nonetheless, but probably due to some hot buffers
> > there is a tendency to crowd a particular freelist.
>
> Well, I did both. Everything looks ok.
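
For reference, the HASH_FIXED_SIZE part amounts to something like the
following in InitBufTable() (a sketch against the current buf_table.c,
not the exact change tested):

    HASHCTL     info;

    /* BufferTag maps to Buffer */
    info.keysize = sizeof(BufferTag);
    info.entrysize = sizeof(BufferLookupEnt);
    info.num_partitions = NUM_BUFFER_PARTITIONS;

    /*
     * HASH_FIXED_SIZE forbids dynahash from allocating fresh elements
     * after startup, so get_hash_entry() must always be satisfied from
     * the freelists (possibly by borrowing from another one).
     */
    SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
                                  size, size,
                                  &info,
                                  HASH_ELEM | HASH_BLOBS |
                                  HASH_PARTITION | HASH_FIXED_SIZE);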

Hmm. v8 returns the stashed element with the original partition index
when the element is *not* reused. But what I saw in the previous test
runs is the REUSE->ENTER(reuse)(->REMOVE) case. So the new version
seems to behave the same way as (or somehow even worse than) the
previous version. get_hash_entry continuously suffers from a lack of
freelist entries. (FWIW, attached are the test-output diffs for both
master and patched.)
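
For reference, dynahash ties the freelist to the hash code, so the
partition index effectively determines which freelist an entry belongs
to (from dynahash.c, quoted from memory, so please double-check):

    #define NUM_FREELISTS			32

    #define FREELIST_IDX(hctl, hashcode) \
    	(IS_PARTITIONED(hctl) ? (hashcode) % NUM_FREELISTS : 0)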

master finally allocated 31 fresh elements for a 100s run.

> ALLOCED: 31 ;; freshly allocated

v8 finally borrowed from another freelist 33620 times and freshly
allocated 0 elements (ah, this version changes that..).
In the end, v8 results in:

> RETURNED: 50806 ;; returned stashed elements
> BORROWED: 33620 ;; borrowed from another freelist
> REUSED: 1812664 ;; stashed
> ASSIGNED: 1762377 ;; reused
> (ALLOCED: 0) ;; freshly allocated

These numbers include a huge degradation caused by the frequent elog
calls, so they cannot be relied on naively, but they should show what
is happening well enough.
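
(The counters were collected with ad-hoc instrumentation along these
lines -- the counter names are mine and this is not code in the tree:)

    /* backend-local counters, bumped on each allocation path in
     * get_hash_entry(); the elog() traffic itself is what makes the
     * instrumented runs so much slower */
    static uint64 cnt_borrowed = 0;
    static uint64 cnt_alloced = 0;

    /* on the borrow-from-another-freelist path: */
    elog(LOG, "BORROWED: " UINT64_FORMAT, ++cnt_borrowed);

    /* where element_alloc() succeeds: */
    elog(LOG, "ALLOCED: " UINT64_FORMAT, ++cnt_alloced);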

> I lost access to the Xeon 8354H, so I returned to the old Xeon X5675.
...
> Strange thing: both master and the patched version have a higher
> peak tps on the X5675 at medium connection counts (17 or 27 clients)
> than the first October version [1], but lower tps at higher
> connection counts (>= 191 clients).
> I'll try to bisect this unfortunate change on master.

The reversal of the preference order between fresh allocation and
borrowing from another freelist might be a factor.
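
Schematically (simplified from get_hash_entry();
borrow_from_other_freelists() here stands in for the actual inline
loop over the other freelists, and is not a real function):

    /* master: allocate a fresh element first, borrow only on failure */
    if (!element_alloc(hashp, hctl->nelem_alloc, freelist_idx))
    	newElement = borrow_from_other_freelists(hashp, freelist_idx);

    /* v8, as I read it: borrow first, allocate only as a last resort */
    newElement = borrow_from_other_freelists(hashp, freelist_idx);
    if (newElement == NULL)
    	(void) element_alloc(hashp, hctl->nelem_alloc, freelist_idx);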

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
