Quick Links

RE: [Patch] Optimize dropping of relation buffers using dlist

From:	"k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To:	'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc:	"tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	RE: [Patch] Optimize dropping of relation buffers using dlist
Date:	2020-09-29 04:04:16
Message-ID:	OSBPR01MB2341EF13BC3FE9879F42B9FFEF320@OSBPR01MB2341.jpnprd01.prod.outlook.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tuesday, September 29, 2020 10:35 AM, Horiguchi-san wrote:

> FWIW, I (and maybe Amit) am thinking that the property we need here is not it
> is cached or not but the accuracy of the returned file length, and that the
> "cached" property should be hidden behind the API.
>
> Another reason for not adding this function is the cached value is not really
> reliable on non-recovery environment.
>
> > So in the new function, it goes something like:
> > if (InRecovery)
> > {
> > if (reln->smgr_cached_nblocks[forknum] !=
> InvalidBlockNumber)
> > return reln->smgr_cached_nblocks[forknum];
> > else
> > return InvalidBlockNumber;
> > }
>
> If we add the new function, it should reutrn InvalidBlockNumber without
> consulting smgr_nblocks().

So here's how I revised it
smgrcachednblocks(SMgrRelation reln, ForkNumber forknum)
{
if (InRecovery)
{
if (reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
return reln->smgr_cached_nblocks[forknum];
}
return InvalidBlockNumber;

> Hmm. The current loop in DropRelFileNodeBuffers looks like this:
>
> if (InRecovery)
> for (for each forks)
> if (the fork meets the criteria)
> <optimized dropping>
> else
> <full scan>
>
> I think this is somewhat different from the current discussion. Whether we
> sum-up the number of blcoks for all forks or just use that of the main fork, we
> should take full scan if we failed to know the accurate size for any one of the
> forks. (In other words, it is stupid that we run a full scan for more than one
> fork at a
> drop.)
>
> Come to think of that, we can naturally sum-up all forks' blocks since anyway
> we need to call smgrnblocks for all forks to know the optimzation is usable.

I understand. We really don't have to enter the optimization when we know the
file size is inaccurate. That also makes the patch simpler.

> So that block would be something like this:
>
> for (forks of the rel)
> /* the function returns InvalidBlockNumber if !InRecovery */
> if (smgrnblocks returned InvalidBlockNumber)
> total_blocks = InvalidBlockNumber;
> break;
> total_blocks += nbloks of this fork
>
> /* <we could rely on the fact that InvalidBlockNumber is zero> */
> if (total_blocks != InvalidBlockNumber && total_blocks < threshold)
> for (forks of the rel)
> for (blocks of the fork)
> <try dropping the buffer for the block>
> else
> <full scan dropping>

I followed this logic in the attached patch.
Thank you very much for the thoughtful reviews.

Performance measurement for large shared buffers to follow.

Best regards,
Kirk Jamison

Attachment	Content-Type	Size
v18-Optimize-DropRelFileNodeBuffers-during-recovery.patch	application/octet-stream	10.7 KB
v1-Prevent-invalidating-blocks-in-smgrextend-during-recovery.patch	application/octet-stream	1.1 KB

In response to

Re: [Patch] Optimize dropping of relation buffers using dlist at 2020-09-29 01:34:39 from Kyotaro Horiguchi

Responses

RE: [Patch] Optimize dropping of relation buffers using dlist at 2020-10-01 01:55:10 from k.jamison@fujitsu.com

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2020-09-29 05:30:50	Re: [patch] Concurrent table reindex per-index progress reporting
Previous Message	Masahiro Ikeda	2020-09-29 02:51:13	Re: New statistics for tuning WAL buffer size