Re: our buffer replacement strategy is kind of lame

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: our buffer replacement strategy is kind of lame
Date: 2011-08-14 18:33:34
Message-ID: CA+TgmoZhXDEanouGJDTnsfhqrt7fe071VJTKxvR7qO=vjt76aQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 14, 2011 at 1:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> On Sat, Aug 13, 2011 at 11:14 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> I agree that something's missing.
>
>> I'm quoting you completely out of context here, but yes, something is missing.
>
>> We can't credibly do one test on usage count in shared buffers and
>> then start talking about how buffer management is all wrong.
>
> More generally: the originally presented facts suggest that there might
> be value in improving the "buffer access strategy" code that keeps
> particular operations from using all of shared_buffers.

Possibly. Since I've realized that we only switch to a ring buffer
when the size of the relation is more than 25% of shared buffers, I'm
less concerned about that problem. My test case only demonstrated a
~20% performance improvement from getting rid of the ring buffer, and
that was with a relation that happened to be 27% of the size of
shared_buffers, so it was a bit unlucky. I think it'd be interesting
to try to make this smarter in some way, but it's not bugging me as
much now that I've realized that I was unlucky to fall down that
particular well.
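
To make the 25% cutoff concrete, here is a minimal sketch of the check described above. It is not PostgreSQL source: the helper name scan_uses_ring_buffer and its parameters are hypothetical, and it assumes the rule is simply "relation larger than one quarter of shared_buffers", with both sizes counted in pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not PostgreSQL source.
 *
 * Returns true when a sequential scan of a relation with rel_blocks pages
 * would switch to a ring buffer, assuming the rule is "relation larger
 * than one quarter of shared_buffers". nbuffers is the size of
 * shared_buffers measured in pages.
 */
static bool
scan_uses_ring_buffer(uint32_t rel_blocks, uint32_t nbuffers)
{
    assert(nbuffers > 0);
    return rel_blocks > nbuffers / 4;
}
```

Under that assumption, with 16384 pages of shared_buffers the cutoff is 4096 pages: a relation at 27% of shared_buffers (4424 pages) lands just past it and gets the ring buffer, while one at 24% (3932 pages) does not.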

> It seems to me
> to be a giant and unsubstantiated leap from that to the conclusion that
> there's anything wrong with the clock sweep algorithm.  Moreover,
> several of the proposed "fixes" amount to reversion to methods that
> we already know are less good than the clock sweep, because we already
> tried them years ago.  So I've been quite unimpressed with the quality
> of discussion in this thread.

Well, here's the problem I'm worried about: if 99% of shared_buffers
is filled with a very hot working set, bringing in each new page will
require scanning, on average, 100 buffers before finding something to
evict. That seems slow. Simon is proposing to bound the
really bad case where you flip through the entire ring multiple times
before you find a buffer, and that may well be worth doing. But I
think even scanning 100 buffers every time you need to bring something
in is too slow. What's indisputable is that a SELECT-only workload
which is larger than shared_buffers can be very easily rate-limited by
the speed at which BufFreelistLock can be taken and released. If you
have a better idea for solving that problem, I'm all ears...
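
The scanning cost discussed above can be illustrated with a toy clock sweep. This is a sketch under stated assumptions, not the PostgreSQL implementation: ToyBuffer, toy_clock_sweep and the starting usage counts are hypothetical, there is no locking or pinning, and no other backend bumps usage counts while the hand moves. If each buffer under the hand were hot independently with probability 0.99, the expected number of buffers examined would be 1 / (1 - 0.99) = 100.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer header: only the usage count. */
typedef struct
{
    int usage_count;
} ToyBuffer;

/*
 * Toy clock sweep (illustrative only).
 *
 * Starting at *hand, examine buffers in circular order. A buffer whose
 * usage_count is zero becomes the victim; any other buffer has its
 * usage_count decremented and is skipped. On return, *examined holds the
 * number of buffers looked at and *hand points just past the victim.
 * The loop terminates because every skipped buffer loses one count.
 */
static size_t
toy_clock_sweep(ToyBuffer *bufs, size_t nbufs, size_t *hand, size_t *examined)
{
    size_t looked = 0;

    assert(nbufs > 0);
    for (;;)
    {
        size_t victim = *hand;

        *hand = (*hand + 1) % nbufs;
        looked++;

        if (bufs[victim].usage_count == 0)
        {
            *examined = looked;
            return victim;
        }
        bufs[victim].usage_count--;
    }
}
```

With 100 buffers of which only the last has a zero usage count, a sweep starting at buffer 0 examines all 100 before evicting. With 4 buffers that all start at a usage count of 5, the hand goes around five full times and stops on its 21st examination, which is the multiple-passes case that a bound on the sweep would cap.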

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company