Re: [Patch] Optimize dropping of relation buffers using dlist

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: k(dot)jamison(at)fujitsu(dot)com
Cc: amit(dot)kapila16(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, tomas(dot)vondra(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-09-02 01:31:22
Message-ID: 20200902.103122.1956905512754123032.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Tue, 1 Sep 2020 13:02:28 +0000, "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> wrote in
> On Tuesday, August 18, 2020 3:05 PM (GMT+9), Amit Kapila wrote:
> > Today, again thinking about this point it occurred to me that during recovery
> > we can reliably find the relation size and after Thomas's recent commit
> > c5315f4f44 (Cache smgrnblocks() results in recovery), we might not need to
> > even incur the cost of lseek. Why don't we fix this first for 'recovery' (by
> > following something on the lines of what Andres suggested) and then later
> > once we have a generic mechanism for "caching the relation size" [1], we can
> > do it for non-recovery paths.
> > I think that will at least address the reported use case with some minimal
> > changes.
> >
> > [1] -
> > https://www.postgresql.org/message-id/CAEepm%3D3SSw-Ty1DFcK%3D1r
> > U-K6GSzYzfdD4d%2BZwapdN7dTa6%3DnQ%40mail.gmail.com

Isn't a relation always locked asscess-exclusively, at truncation
time? If so, isn't even the result of lseek reliable enough? And if
we don't care the cost of lseek, we can do the same optimization also
for non-recovery paths. Since anyway we perform actual file-truncation
just after so I think the cost of lseek is negligible here.

> Attached is an updated V9 version with minimal code changes only and
> avoids the previous overhead in the BufferAlloc. This time, I only updated
> the recovery path as suggested by Amit, and followed Andres' suggestion
> of referring to the cached blocks in smgrnblocks.
> The layering is kinda tricky so the logic may be wrong. But as of now,
> it passes the regression tests. I'll follow up with the performance results.
> It seems there's regression for smaller shared_buffers. I'll update if I find bugs.
> But I'd also appreciate your reviews in case I missed something.

BUF_DROP_THRESHOLD seems to be misued. IIUC it defines the maximum
number of file pages that we make relation-targetted search for
buffers. Otherwise we scan through all buffers. On the other hand the
latest patch just leaves all buffers for relation forks longer than
the threshold.

I think we should determine whether to do targetted-scan or full-scan
based on the ratio of (expectedly maximum) total number of pages for
all (specified) forks in a relation against total number of buffers.

By the way

> #define BUF_DROP_THRESHOLD 500 /* NBuffers divided by 2 */

NBuffers is not a constant. Even if we wanted to set the macro as
described in the comment, we should have used (NBuffers/2) instead of
"500". But I suppose you might wanted to use (NBuffders / 500) as Tom
suggested upthread. And the name of the macro seems too generic. I
think more specific names like BUF_DROP_FULLSCAN_THRESHOLD would be
better.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-09-02 01:36:13 Re: [Patch] Optimize dropping of relation buffers using dlist
Previous Message Alvaro Herrera 2020-09-02 01:29:28 Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly