RE: [Patch] Optimize dropping of relation buffers using dlist

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: 'Kyotaro Horiguchi' <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "amit(dot)kapila16(at)gmail(dot)com" <amit(dot)kapila16(at)gmail(dot)com>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "tomas(dot)vondra(at)2ndquadrant(dot)com" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-09-29 04:04:16
Message-ID: OSBPR01MB2341EF13BC3FE9879F42B9FFEF320@OSBPR01MB2341.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tuesday, September 29, 2020 10:35 AM, Horiguchi-san wrote:

> FWIW, I (and maybe Amit) am thinking that the property we need here is not it
> is cached or not but the accuracy of the returned file length, and that the
> "cached" property should be hidden behind the API.
>
> Another reason for not adding this function is the cached value is not really
> reliable on non-recovery environment.
>
> > So in the new function, it goes something like:
> > if (InRecovery)
> > {
> > if (reln->smgr_cached_nblocks[forknum] !=
> InvalidBlockNumber)
> > return reln->smgr_cached_nblocks[forknum];
> > else
> > return InvalidBlockNumber;
> > }
>
> If we add the new function, it should reutrn InvalidBlockNumber without
> consulting smgr_nblocks().

So here's how I revised it
smgrcachednblocks(SMgrRelation reln, ForkNumber forknum)
{
if (InRecovery)
{
if (reln->smgr_cached_nblocks[forknum] != InvalidBlockNumber)
return reln->smgr_cached_nblocks[forknum];
}
return InvalidBlockNumber;

> Hmm. The current loop in DropRelFileNodeBuffers looks like this:
>
> if (InRecovery)
> for (for each forks)
> if (the fork meets the criteria)
> <optimized dropping>
> else
> <full scan>
>
> I think this is somewhat different from the current discussion. Whether we
> sum-up the number of blcoks for all forks or just use that of the main fork, we
> should take full scan if we failed to know the accurate size for any one of the
> forks. (In other words, it is stupid that we run a full scan for more than one
> fork at a
> drop.)
>
> Come to think of that, we can naturally sum-up all forks' blocks since anyway
> we need to call smgrnblocks for all forks to know the optimzation is usable.

I understand. We really don't have to enter the optimization when we know the
file size is inaccurate. That also makes the patch simpler.

> So that block would be something like this:
>
> for (forks of the rel)
> /* the function returns InvalidBlockNumber if !InRecovery */
> if (smgrnblocks returned InvalidBlockNumber)
> total_blocks = InvalidBlockNumber;
> break;
> total_blocks += nbloks of this fork
>
> /* <we could rely on the fact that InvalidBlockNumber is zero> */
> if (total_blocks != InvalidBlockNumber && total_blocks < threshold)
> for (forks of the rel)
> for (blocks of the fork)
> <try dropping the buffer for the block>
> else
> <full scan dropping>

I followed this logic in the attached patch.
Thank you very much for the thoughtful reviews.

Performance measurement for large shared buffers to follow.

Best regards,
Kirk Jamison

Attachment Content-Type Size
v18-Optimize-DropRelFileNodeBuffers-during-recovery.patch application/octet-stream 10.7 KB
v1-Prevent-invalidating-blocks-in-smgrextend-during-recovery.patch application/octet-stream 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-09-29 05:30:50 Re: [patch] Concurrent table reindex per-index progress reporting
Previous Message Masahiro Ikeda 2020-09-29 02:51:13 Re: New statistics for tuning WAL buffer size