RE: [Patch] Optimize dropping of relation buffers using dlist

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-09-23 07:57:33
Message-ID: OSBPR01MB2341D8BD41D43BA89C65E69BEF380@OSBPR01MB2341.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, September 23, 2020 2:37 PM, Tsunakawa, Takayuki wrote:
> > I revised the patch based from my understanding of Horiguchi-san's
> > comment, but I could be wrong.
> > Quoting:
> >
> > "
> > + /* Get the number of blocks for the supplied
> relation's
> > fork */
> > + nblocks = smgrnblocks(smgr_reln,
> > forkNum[fork_num]);
> > + Assert(BlockNumberIsValid(nblocks));
> > +
> > + if (nblocks <
> BUF_DROP_FULLSCAN_THRESHOLD)
> >
> > As mentioned upthread, the criteria whether we do full-scan or
> > lookup-drop is how large portion of NBUFFERS this relation-drop can be
> > going to invalidate. So the nblocks above should be the sum of number
> > of blocks to be truncated (not just the total number of blocks) of all
> > designated forks. Then once we decided to do lookup-drop method, we
> > do that for all forks."
>
> One takeaway from Horiguchi-san's comment is to use the number of blocks
> to invalidate for comparison, instead of all blocks in the fork. That is, use
>
> nblocks = smgrnblocks(fork) - firstDelBlock[fork];
>
> Does this make sense?

Hmm. Ok, I think it got too much to my head that I misunderstood what it meant.
I'll debug again by using ereport just to check the values and behavior are correct.
Your comment about V14 patch has dawned on me that it reverted to previous
slower version where we scan NBuffers for each fork. Thank you for explaining it.

> What do you think is the reason for summing up all forks? I didn't
> understand why. Typically, FSM and VM forks are very small. If the main
> fork is larger than NBuffers / 500, then v14 scans the entire shared buffers for
> the FSM and VM forks as well as the main fork, resulting in three scans in
> total.
>
> Also, if you want to judge the criteria based on the total blocks of all forks, the
> following if should be placed outside the for loop, right? Because this if
> condition doesn't change inside the for loop.
>
> + if ((nblocks / (uint32)NBuffers) <
> BUF_DROP_FULLSCAN_THRESHOLD &&
> + BlockNumberIsValid(nblocks))
> + {
>
>
>
> > > (2)
> > > + if ((nblocks / (uint32)NBuffers) <
> > > BUF_DROP_FULLSCAN_THRESHOLD &&
> > > + BlockNumberIsValid(nblocks))
> > > + {
> > >
> > > The division by NBuffers is not necessary, because both sides of =
> > > are number of blocks.
> >
> > Again I based it from my understanding of the comment above, so
> > nblocks is the sum of all blocks to be truncated for all forks.
>
> But the left expression of "<" is a percentage, while the right one is a block
> count. Two different units are compared.
>

Right. Makes sense. Fixed.

> > > Why is BlockNumberIsValid(nblocks)) call needed?
> >
> > I thought we need to ensure that nblocks is not invalid, so I also
> > added
>
> When is it invalid? smgrnblocks() seems to always return a valid block
> number. Am I seeing a different source code (I saw HEAD)?

It's based from the discussion upthread to guarantee the cache to be valid while recovery
and that we don't want to proceed with the optimization in case that nblocks is invalid.
It may not be needed so I already removed it, because the correct direction is ensuring that
smgrnblocks return the precise value.
Considering the test case that Horiguchi-san suggested (attached as separate patch),
then maybe there's no need to indicate it in the loop condition.
For now, I haven't modified the design (or created a new function) of smgrnblocks,
and I just updated the patches based from the recent comments.

Thank you very much again for the reviews.

Best regards,
Kirk Jamison

Attachment Content-Type Size
v15-Speedup-dropping-of-relation-buffers-during-recovery.patch application/octet-stream 7.4 KB
v1-Prevent-smgrextend-from-invalidating-blocks-during-recovery.patch application/octet-stream 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-09-23 07:57:44 Re: Memory allocation abstraction in pgcrypto
Previous Message Keisuke Kuroda 2020-09-23 07:39:15 Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables