RE: [Patch] Optimize dropping of relation buffers using dlist

From: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>
To: "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-09-15 11:11:26
Message-ID: OSBPR01MB2341126E2FDAF52547C5B0BBEF200@OSBPR01MB2341.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi,

> BTW, I think I see one problem in the code:
> >
> > if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
> > + bufHdr->tag.forkNum == forkNum[j] && tag.blockNum >=
> > + bufHdr->firstDelBlock[j])
> >
> > Here, I think you need to use 'i' not 'j' for forkNum and
> > firstDelBlock as those are arrays w.r.t forks. That might fix the
> > problem but I am not sure as I haven't tried to reproduce it.
>
> Thanks for the advice. Right, that seems to be the cause of the error, and fixing that
> (using fork) solved the case.
> I also followed Tsunakawa-san's advice to use more meaningful
> iterator names instead of "i" & "j", for readability.
>
> I also added a new function (DropRelFileNodeBuffersOfFork) for when a relation
> fork is bigger than the threshold, i.e.
> if (nblocks > BUF_DROP_FULLSCAN_THRESHOLD). Perhaps there's a better name for that
> function.
> However, as expected in the previous discussions, this is a bit slower than the
> standard buffer invalidation process, because the whole shared buffers are
> scanned nfork times.
> Currently, I set the threshold to (NBuffers / 500)

I made a mistake in v12. I have replaced firstDelBlock[fork_num] with firstDelBlock[block_num]
in the for-loop over block_num, because we want to process the current block of the per-block loop.

OTOH, I used firstDelBlock[fork_num] when the relation fork is bigger than the threshold,
or when the cached blocks of small relations have already been invalidated.
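
To make the indexing question concrete, here is a minimal standalone sketch. It is not code from the patch: ToyBuffer, pool, drop_by_full_scan, drop_by_block_lookup and drop_rel_buffers are hypothetical names, the pool size is arbitrary, and a linear search stands in for whatever per-block lookup the patch uses. It assumes, as Amit noted, that forkNum[] and firstDelBlock[] are per-fork arrays, so it indexes them with the fork iterator and uses block_num only as the loop variable of the per-block path.

```c
#include <stdbool.h>

#define NBUFFERS 1000                                  /* toy pool size (assumption) */
#define BUF_DROP_FULLSCAN_THRESHOLD (NBUFFERS / 500)   /* threshold ratio from the mail */
#define NFORKS 2

/* Hypothetical stand-in for a shared buffer header. */
typedef struct ToyBuffer
{
    bool     valid;
    int      forkNum;
    unsigned blockNum;
} ToyBuffer;

static ToyBuffer pool[NBUFFERS];

/* Large fork: scan the whole pool once, dropping blocks >= firstDelBlock. */
static int
drop_by_full_scan(int fork, unsigned firstDelBlock)
{
    int dropped = 0;

    for (int i = 0; i < NBUFFERS; i++)
    {
        if (pool[i].valid && pool[i].forkNum == fork &&
            pool[i].blockNum >= firstDelBlock)
        {
            pool[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}

/* Small fork: visit each block to delete and look up its buffer. */
static int
drop_by_block_lookup(int fork, unsigned firstDelBlock, unsigned nblocks)
{
    int dropped = 0;

    for (unsigned block_num = firstDelBlock; block_num < nblocks; block_num++)
    {
        /* Linear search stands in for a real per-block lookup. */
        for (int i = 0; i < NBUFFERS; i++)
        {
            if (pool[i].valid && pool[i].forkNum == fork &&
                pool[i].blockNum == block_num)
            {
                pool[i].valid = false;
                dropped++;
                break;
            }
        }
    }
    return dropped;
}

/* Both per-fork arrays are indexed by fork_num, never by a block counter. */
static int
drop_rel_buffers(const int *forkNum, int nforks,
                 const unsigned *firstDelBlock, const unsigned *nblocks)
{
    int dropped = 0;

    for (int fork_num = 0; fork_num < nforks; fork_num++)
    {
        if (nblocks[fork_num] > BUF_DROP_FULLSCAN_THRESHOLD)
            dropped += drop_by_full_scan(forkNum[fork_num],
                                         firstDelBlock[fork_num]);
        else
            dropped += drop_by_block_lookup(forkNum[fork_num],
                                            firstDelBlock[fork_num],
                                            nblocks[fork_num]);
    }
    return dropped;
}
```

In this sketch the per-block path starts at firstDelBlock[fork_num] and walks block_num upward, so the array itself is only ever read at a fork index. Whether that matches the intent of v13 is exactly the point I would like reviewed.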

I am not sure whether this logic is correct, so I'd appreciate any feedback or advice.

Regards,
Kirk Jamison

Attachment Content-Type Size
v13-Speedup-dropping-of-relation-buffers-during-recovery.patch application/octet-stream 9.7 KB
