Re: BufferAlloc: don't take two simultaneous locks

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, michail(dot)nikolaev(at)gmail(dot)com, x4mmm(at)yandex-team(dot)ru, andres(at)anarazel(dot)de, simon(dot)riggs(at)enterprisedb(dot)com
Subject: Re: BufferAlloc: don't take two simultaneous locks
Date: 2022-03-15 22:13:09
Message-ID: 3f993baebf7d5a5448a962fd3e663d707045f3f8.camel@postgrespro.ru
Lists: pgsql-hackers

On Tue, 15/03/2022 at 13:47 +0300, Yura Sokolov wrote:
> On Tue, 15/03/2022 at 16:25 +0900, Kyotaro Horiguchi wrote:
> > Thanks for the new version.
> >
> > At Tue, 15 Mar 2022 08:07:39 +0300, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote in
> > > On Mon, 14/03/2022 at 14:57 +0300, Yura Sokolov wrote:
> > > > On Mon, 14/03/2022 at 17:12 +0900, Kyotaro Horiguchi wrote:
> > > > > At Mon, 14 Mar 2022 09:15:11 +0300, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote in
> > > > > > On Mon, 14/03/2022 at 14:31 +0900, Kyotaro Horiguchi wrote:
> > > > > I tried pgbench runs at scale 100 (with 10 threads, 10 clients) on
> > > > > 128kB shared buffers and saw that get_hash_entry never takes the
> > > > > !element_alloc() path and always allocates a fresh entry, then
> > > > > saturates at 30 new elements allocated by the middle of a
> > > > > 100-second run.
> > > > >
> > > > > Then I tried the same with the patch, and I was surprised to see
> > > > > that the rise in the number of newly allocated elements didn't
> > > > > stop, going up to 511 elements over the 100-second run. So I found
> > > > > that my concern was valid: the change in dynahash actually
> > > > > continuously/repeatedly causes a lack of free-list entries. I'm
> > > > > not sure how much impact changing get_hash_entry to prefer other
> > > > > freelists would have on performance, though.
> > > >
> > > > Well, it is quite strange that SharedBufHash is not allocated as
> > > > HASH_FIXED_SIZE. Could you check what happens with this flag set?
> > > > I'll try as well.
> > > >
> > > > Another way to reduce the observed case is to remember the
> > > > freelist_idx of a reused entry. I didn't believe it mattered much,
> > > > since entries migrate regardless, but probably due to some hot
> > > > buffers there is a tendency to crowd a particular freelist.
> > >
> > > Well, I did both. Everything looks ok.
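
Concretely, the HASH_FIXED_SIZE part is just one more flag in
InitBufTable(); a sketch against current buf_table.c, not the exact
patch:

    /* buf_table.c, InitBufTable(): with HASH_FIXED_SIZE the initial
     * nelem becomes a hard limit, so get_hash_entry() can never call
     * element_alloc() to grow the table after startup. */
    SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
                                  size, size,
                                  &info,
                                  HASH_ELEM | HASH_BLOBS |
                                  HASH_PARTITION | HASH_FIXED_SIZE);

The freelist_idx part simply records which freelist the stashed element
came from, so an unused element is returned there instead of to the
current partition's freelist.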
> >
> > Hmm. v8 returns the stashed element with its original partition index
> > when the element is *not* reused. But what I saw in the previous test
> > runs is the REUSE->ENTER(reuse)(->REMOVE) case. So the new version
> > looks like it behaves the same way as (or somehow even worse than)
> > the previous version.
>
> In the REMOVE case, v8 differs neither from master nor from the
> previous version. It differs only in the RETURNED case.
> Or I didn't understand what you meant :(
>
> > get_hash_entry continuously suffers from a lack of freelist entries.
> > (FWIW, attached are the test-output diffs for both master and
> > patched.)
> >
> > master finally allocated 31 fresh elements for a 100s run.
> >
> > > ALLOCED: 31 ;; freshly allocated
> >
> > v8 finally borrowed 33620 times from another freelist, with 0 freshly
> > allocated (ah, this version changes that...).
> > Finally, v8 results in:
> >
> > > RETURNED: 50806 ;; returned stashed elements
> > > BORROWED: 33620 ;; borrowed from another freelist
> > > REUSED: 1812664 ;; stashed
> > > ASSIGNED: 1762377 ;; reused
> > > (ALLOCED: 0) ;; freshly allocated
> >
> > The numbers include a huge degradation from the frequent elog() calls,
> > so they cannot be naively relied on, but they should show what is
> > happening well enough.
>
> Is there any measurable performance hit caused by the borrowing?
> It looks like "borrowed" happened about 1.5% of the time. And that is
> with 128kB shared buffers, which is extremely small. (Or was it 128MB?)
>
> Well, I think some spare entries could reduce the borrowing if there
> is a need. I'll test at 128MB with spare entries. If there is a
> benefit, I'll keep some, but SharedBufHash will stay fixed-size.

Well, I added GetMaxBackends spare items, but I don't see a clear
benefit. It is probably a bit better at 128MB shared buffers and
probably a bit worse at 1GB shared buffers (select_only at scale 100).

But that is on an old Xeon X5675; things may well look different on
more capable hardware. I just don't have access to any at the moment.
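
For the record, the spare-items experiment amounts to oversizing the
now-fixed table; a sketch of the sizing, which may not match the
patch's exact placement:

    /* Sketch: one spare entry per backend on top of the usual
     * NBuffers + NUM_BUFFER_PARTITIONS, so a backend holding a
     * stashed element doesn't starve another partition's freelist. */
    size = NBuffers + NUM_BUFFER_PARTITIONS + GetMaxBackends();
    SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
                                  size, size,
                                  &info,
                                  HASH_ELEM | HASH_BLOBS |
                                  HASH_PARTITION | HASH_FIXED_SIZE);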

>
> The master branch does fewer freelist manipulations, since it tries to
> insert first, and if there is a collision it doesn't delete the victim
> buffer's entry.
>
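
For readers without the code at hand, that ordering in master's
BufferAlloc() looks roughly like this (heavily abridged):

    /* BufferAlloc() in master, abridged: insert the new tag first. */
    buf_id = BufTableInsert(&newTag, newHash, buf->buf_id);
    if (buf_id >= 0)
    {
        /* Collision: the page is already in some buffer.  The victim's
         * old hash entry is left untouched -- no delete, no freelist
         * traffic -- we give up the victim and use the found buffer. */
        return GetBufferDescriptor(buf_id);     /* simplified */
    }
    /* No collision: only now is the victim's old entry removed. */
    if (oldFlags & BM_TAG_VALID)
        BufTableDelete(&oldTag, oldHash);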

-----

regards
Yura
