Re: BufferAlloc: don't take two simultaneous locks

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, michail(dot)nikolaev(at)gmail(dot)com, x4mmm(at)yandex-team(dot)ru, andres(at)anarazel(dot)de, simon(dot)riggs(at)enterprisedb(dot)com
Subject: Re: BufferAlloc: don't take two simultaneous locks
Date: 2022-03-15 10:47:17
Message-ID: 8da5a397ef639f0ffc2e46c0b8312ee25fb09c58.camel@postgrespro.ru
Lists: pgsql-hackers

On Tue, 15/03/2022 at 16:25 +0900, Kyotaro Horiguchi wrote:
> Thanks for the new version.
>
> At Tue, 15 Mar 2022 08:07:39 +0300, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote in
> > On Mon, 14/03/2022 at 14:57 +0300, Yura Sokolov wrote:
> > > On Mon, 14/03/2022 at 17:12 +0900, Kyotaro Horiguchi wrote:
> > > > At Mon, 14 Mar 2022 09:15:11 +0300, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote in
> > > > > On Mon, 14/03/2022 at 14:31 +0900, Kyotaro Horiguchi wrote:
> > > > I tried pgbench runs with scale 100 (with 10 threads, 10 clients) on
> > > > 128kB shared buffers and I saw that get_hash_entry never takes the
> > > > !element_alloc() path and always allocates a fresh entry, then
> > > > saturates at 30 new elements allocated around the middle of a
> > > > 100-second run.
> > > >
> > > > Then I tried the same with the patch, and I was surprised to see that
> > > > the rise in the number of newly allocated elements didn't stop and
> > > > went up to 511 elements after the 100-second run. So I found that my
> > > > concern was valid. The change in dynahash actually
> > > > continuously/repeatedly causes a lack of free-list entries. I'm not
> > > > sure how much impact it has on performance if we change
> > > > get_hash_entry to prefer other freelists, though.
> > >
> > > Well, it is quite strange that SharedBufHash is not allocated as
> > > HASH_FIXED_SIZE. Could you check what happens with this flag set?
> > > I'll try as well.
> > >
> > > Another way to reduce the observed effect is to remember freelist_idx
> > > for the reused entry. I didn't believe it mattered much since entries
> > > migrate nevertheless, but probably due to some hot buffers there is a
> > > tendency to crowd a particular freelist.
> >
> > Well, I did both. Everything looks ok.
>
> Hmm. v8 returns the stashed element with the original partition index
> when the element is *not* reused. But what I saw in the previous test
> runs is the REUSE->ENTER(reuse)(->REMOVE) case. So the new version
> looks like it behaves the same way as (or somehow even worse than) the
> previous version.

In the REMOVE case v8 doesn't differ from either master or the
previous version. It differs only in the RETURNED case.
Or perhaps I didn't understand what you mean :(
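Just to make sure we mean the same thing by the RETURNED case, here is
a toy model of the idea (not the real dynahash code; locks, borrowing
and all the names here are made up purely for illustration): the
stashed entry remembers which freelist it was taken from and goes back
there if it ends up unused.

    #include <stdio.h>
    #include <stdlib.h>

    #define NUM_FREELISTS 32

    typedef struct Elem
    {
        struct Elem *next;
        int          home_freelist;  /* freelist this element came from */
    } Elem;

    static Elem *freelists[NUM_FREELISTS];

    /* Pop an element from the given freelist, remembering its origin. */
    static Elem *
    get_elem(int idx)
    {
        Elem *e = freelists[idx];

        if (e != NULL)
        {
            freelists[idx] = e->next;
            e->home_freelist = idx;
        }
        return e;
    }

    /*
     * Put a stashed-but-unused element back onto the freelist it was
     * taken from, rather than onto the current partition's freelist.
     */
    static void
    return_elem(Elem *e)
    {
        e->next = freelists[e->home_freelist];
        freelists[e->home_freelist] = e;
    }

    int
    main(void)
    {
        Elem *seed = calloc(1, sizeof(Elem));

        freelists[3] = seed;      /* pretend freelist 3 has one free entry */

        Elem *e = get_elem(3);    /* stash an entry for a possible insert */
        /* ... the insert turned out unnecessary (RETURNED case) ... */
        return_elem(e);           /* goes back to freelist 3, not elsewhere */

        printf("entry returned to freelist %d\n", e->home_freelist);
        free(seed);
        return 0;
    }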

> get_hash_entry continuously suffers from a lack of freelist
> entries. (FWIW, attached are the test-output diffs for both master and
> patched)
>
> master finally allocated 31 fresh elements for a 100s run.
>
> > ALLOCED: 31 ;; freshly allocated
>
> v8 finally borrowed 33620 times from another freelist and freshly
> allocated 0 entries (ah, this version changes that..)
> In the end, v8 results in:
>
> > RETURNED: 50806 ;; returned stashed elements
> > BORROWED: 33620 ;; borrowed from another freelist
> > REUSED: 1812664 ;; stashed
> > ASSIGNED: 1762377 ;; reused
> (ALLOCED: 0) ;; freshly allocated
>
> It contains a huge degradation from the frequent elog calls, so the
> numbers cannot be naively relied on, but they should show what is
> happening well enough.

Is there any measurable performance hit caused by the borrowing?
It looks like "borrowed" happened about 1.5% of the time. And that is on
128kB shared buffers, which is extremely small. (Or was it 128MB?)

Well, I think some spare entries could reduce borrowing if there is
a need. I'll test on 128MB with spare entries. If there is a benefit,
I'll include some spares, but will keep SharedBufHash fixed-size.
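Roughly what I plan to try in InitBufTable (untested sketch; the number
of spare entries, one per partition here, is just a placeholder to be
benchmarked):

    void
    InitBufTable(int size)
    {
        HASHCTL     info;

        /* BufferTag maps to Buffer */
        info.keysize = sizeof(BufferTag);
        info.entrysize = sizeof(BufferLookupEnt);
        info.num_partitions = NUM_BUFFER_PARTITIONS;

        /*
         * A handful of spare entries to soften freelist imbalance, and
         * HASH_FIXED_SIZE so dynahash never calls element_alloc() at
         * runtime.
         */
        SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
                                      size + NUM_BUFFER_PARTITIONS,
                                      size + NUM_BUFFER_PARTITIONS,
                                      &info,
                                      HASH_ELEM | HASH_BLOBS |
                                      HASH_PARTITION | HASH_FIXED_SIZE);
    }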

The master branch does fewer freelist manipulations since it tries to
insert first, and if there is a collision it doesn't delete the victim
buffer's old hash entry.
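For comparison, the order of operations on master is roughly this
(schematic of BufferAlloc, heavily trimmed, not the literal code):

    /* Try to map the new tag first; buf is the chosen victim. */
    buf_id = BufTableInsert(&newTag, newHash, buf->buf_id);
    if (buf_id >= 0)
    {
        /*
         * Somebody else already inserted an entry for the new tag, so
         * the victim is simply unpinned: its old entry is never deleted
         * and no freelist is touched on this path.
         */
        UnpinBuffer(buf, true);
        /* ... return the already-existing buffer ... */
    }

    /* Only after a successful insert is the victim's old tag removed. */
    if (oldFlags & BM_TAG_VALID)
        BufTableDelete(&oldTag, oldHash);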

> > I lost access to the Xeon 8354H, so I returned to the old Xeon X5675.
> ...
> > Strange thing: both master and the patched version have higher
> > peak tps on the X5675 at medium connection counts (17 or 27 clients)
> > than the first October version [1], but lower tps at higher
> > connection counts (>= 191 clients).
> > I'll try to bisect this unfortunate change on master.
>
> The reversal of the preference order between fresh allocation and
> borrowing from another freelist might have an effect.

`master` changed its behaviour as well.
It is not a problem of the patch at all.

------

regards
Yura.
