Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
Date: 2018-07-12 00:19:29
Message-ID: CAJrrPGduEpxwtu9VFxT21DNK=WRP=LUjK4GjPfm+4+PCjpcAxA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 12, 2018 at 8:32 AM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> On Thu, Jul 12, 2018 at 12:46 AM, Haribabu Kommi
> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
> >> > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote:
> >> >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up
> >> >> as "reads" when in fact they are not reads at all:
> >> >>
> >> >> 1. Relation extension, which in fact writes a zero-filled block.
> >> >> 2. The RBM_ZERO_* modes, which provoke neither read nor write.
> >
> > I checked the patch and I agree with the change 1). And regarding change 2)
> > whether it is zeroing the contents of the page or not, it does read?
> > because if it exists in the buffer pool, we are counting them as hits
> > irrespective of the mode? Am I missing something?
>
> Further down in the function you can see that there is no read()
> system call for the RBM_ZERO_* modes:
>
>     if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
>         MemSet((char *) bufBlock, 0, BLCKSZ);
>     else
>     {
>         ...
>         smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
>         ...
>     }
>

Thanks for the details, I see your point. But we still need to count
the reads done in the RBM_ZERO_ON_ERROR case, since that mode does call
smgrread(). Excluding the other RBM_ZERO_* modes is fine.
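
To spell out the rule I have in mind, here is a minimal standalone sketch.
It is not code from the patch: the enum is a local stand-in for
ReadBufferMode, and miss_performs_read() is a hypothetical helper name used
only for illustration.

```c
#include <stdbool.h>

/* Local stand-in for PostgreSQL's ReadBufferMode, for illustration only. */
typedef enum
{
    RBM_NORMAL,
    RBM_ZERO_AND_LOCK,
    RBM_ZERO_AND_CLEANUP_LOCK,
    RBM_ZERO_ON_ERROR,
    RBM_NORMAL_NO_LOG
} ReadBufferMode;

/*
 * Hypothetical helper: on a cache miss, does this mode go through
 * smgrread(), and so deserve to be counted as a "read"?
 */
static bool
miss_performs_read(ReadBufferMode mode)
{
    /* These two modes zero-fill the buffer and never call smgrread(). */
    if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
        return false;

    /*
     * Everything else reads the block. RBM_ZERO_ON_ERROR zeroes the page
     * only after the read, if the page turns out to be invalid.
     */
    return true;
}
```

With this rule, RBM_ZERO_ON_ERROR stays in the "read" bucket and only the
two zero-and-lock modes are excluded.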

> I suppose someone might argue that even when it's not a hit and it's
> not a read, we might still want to count this buffer interaction in
> some other way. Perhaps there should be a separate counter? It may
> technically be a kind of cache miss, but it's nowhere near as
> expensive as a synchronous system call like read() so I didn't propose
> that.
>

Yes, I agree that we may need a new counter for buffers that are only
allocated (neither read nor written). But at present that counter's value
would probably be very small, so people may not be interested in it.
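
For discussion, such a counter could look roughly like the sketch below.
Everything here is an assumption: ToyBufferUsage, its "allocated" field and
toy_count_access() are hypothetical names, not existing PostgreSQL
structures, and relation extension (which writes a block) is left out.

```c
#include <stdbool.h>

/* Toy counters; "allocated" is the hypothetical new counter. */
typedef struct
{
    long hits;       /* page was already in the buffer pool */
    long reads;      /* cache miss that issued a read */
    long allocated;  /* cache miss satisfied with no read and no write */
} ToyBufferUsage;

/* Classify one buffer access into exactly one of the three counters. */
static void
toy_count_access(ToyBufferUsage *usage, bool found, bool did_read)
{
    if (found)
        usage->hits++;
    else if (did_read)
        usage->reads++;
    else
        usage->allocated++;
}
```

A zeroed page taken from the freelist would then show up under "allocated"
instead of inflating "reads".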

> Some more on my motivation: In our zheap prototype, when the system
> is working well and we have enough space, we constantly allocate
> zeroed buffer pages at the insert point (= head) of an undo log and
> drop pages at the discard point (= tail) in the background;
> effectively a few pages just go round and round via the freelist and
> no read() or write() syscalls happen. That's something I'm very happy
> about and it's one of our claimed advantages over the traditional heap
> (which tends to read and dirty more pages), but EXPLAIN (BUFFERS)
> hides this virtuous behaviour when comparing with the traditional
> heap: it falsely and slanderously reports that zheap is reading undo
> pages when it is not. Of course I don't intend to litigate zheap
> design in this thread; I just figured that since this accounting is
> wrong on principle and affects current PostgreSQL too (at least in
> theory) I would propose this little patch independently. It's subtle
> enough that I wouldn't bother to back-patch it though.
>

OK. Maybe it would be better to implement the buffer allocation counter
along with zheap, so that the buffer statistics are more accurate?

Regards,
Haribabu Kommi
Fujitsu Australia
