Re: Bgwriter LRU cleaning: we've been going at this all wrong

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bgwriter LRU cleaning: we've been going at this all wrong
Date: 2007-06-30 02:28:15
Message-ID: Pine.GSO.4.64.0706292145090.7521@westnet.com
Lists: pgsql-hackers

On Fri, 29 Jun 2007, Jim Nasby wrote:

> On Jun 26, 2007, at 11:57 PM, Greg Smith wrote:
>> I have a complete set of working code that tracks buffer usage
>> statistics...
>
> Even if it's not used by bgwriter for self-tuning, having that information
> available would be very useful for anyone trying to hand-tune the system.

The stats information that's in pg_stat_bgwriter combined with an
occasional snapshot of the current pg_buffercache (now with usage
counts!) is just as useful. Right before freeze, I made sure everything I
was using for hand-tuning in this area made it into one of those. Really
all I do is collect that data as I happen to be scanning the buffer cache
anyway.
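
As a rough sketch of what such a snapshot can be boiled down to for hand-tuning, the Python below counts buffers by usage count and dirty flag. It assumes the rows have already been fetched as (usagecount, isdirty) pairs, the two pg_buffercache columns relevant here; the function name and the sample data are made up for illustration.

```python
from collections import Counter

def summarize_snapshot(rows):
    """Count buffers per (usagecount, isdirty) combination.

    rows: iterable of (usagecount, isdirty) pairs, assumed to come from
    a pg_buffercache snapshot taken by some other means.
    """
    summary = Counter()
    for usagecount, isdirty in rows:
        summary[(usagecount, bool(isdirty))] += 1
    return dict(summary)

# Hypothetical snapshot: mostly hot buffers, a few cold ones.
sample = [(5, True), (5, True), (0, False), (1, True), (5, False)]
summary = summarize_snapshot(sample)
for (usagecount, isdirty), count in sorted(summary.items()):
    print(usagecount, "dirty" if isdirty else "clean", count)
```

A large share of dirty buffers at high usage counts is the pattern discussed further down in this message.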

The way I'm keeping track of things internally is more intrusive to
collect than something I'd like to be turned on by default just for
information, and exposing what it knows to user-space isn't done yet. I
was hoping to figure out a way to use it to help justify its overhead
before bothering to optimize and report on it. The only reason I
mentioned the code at all is because I didn't want anybody else to waste
time writing that particular routine when I already have something that
works for this purpose sitting around.

> Is this still a serious issue with LDC?

Part of the reason I'm bugged about this area is that the scenario I'm
bringing up--lots of dirty and high usage buffers in a pattern the BGW
isn't good at writing causing buffer pool allocations to be slow--has the
potential to get even worse with LDC. Right now, if you're in this
particular failure mode, you can be "saved" by the next checkpoint because
it is going to flush all the dirty buffers out as fast as possible and
then you get to start over with a fairly clean slate. Once that stops
happening, I've observed that the potential to run into this sort of
breakdown increases.

> I share Greg Stark's concern that we're going to end up wasting a lot of
> writes.

I don't think the goal is to write buffers significantly faster than they
have to be in order to support new allocations; the idea is just to avoid
ever scanning the same section more than once when it's not possible for
it to find new things to do there. Right now there are substantial wasted
CPU/locking resources if you try to tune the LRU writer up for a heavy
load (by doing things like increasing the percentage), as it just
keeps scanning the same high-usage count buffers over and over. With the
LRU now running during LDC, my gut feeling is its efficiency is even more
important now than it used to be. If it's wasteful of resources, that's
now even going to impact checkpoints, where before the two never happened
at the same time.
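
The rescanning cost can be illustrated with a toy model in Python. This is not PostgreSQL's bgwriter code: the buffer pool is a plain list, the allocation point never moves (no new allocations happen between rounds), and the "remember" option is a hypothetical stand-in for any scheme that records how far ahead has already been scanned.

```python
def scan_rounds(usage_counts, dirty, window, rounds, remember):
    """Return (buffers_examined, buffers_written) for a toy LRU writer.

    Each round the writer looks at up to `window` buffers ahead of the
    allocation point and writes those that are dirty with usage count 0.
    With remember=True it skips the part it has already scanned.
    """
    dirty = list(dirty)      # work on a copy
    examined = written = 0
    start = 0                # allocation point; fixed in this toy model
    cleaned_to = start       # first buffer not yet scanned
    end = min(start + window, len(usage_counts))
    for _ in range(rounds):
        begin = cleaned_to if remember else start
        for i in range(begin, end):
            examined += 1
            if dirty[i] and usage_counts[i] == 0:
                dirty[i] = False
                written += 1
        if remember:
            cleaned_to = max(cleaned_to, end)
    return examined, written

# Eight hot buffers followed by two cold ones, all dirty.
usage = [5] * 8 + [0] * 2
all_dirty = [True] * 10
naive = scan_rounds(usage, all_dirty, window=10, rounds=4, remember=False)
smart = scan_rounds(usage, all_dirty, window=10, rounds=4, remember=True)
print(naive, smart)
```

Both variants write the same two cold buffers; the difference is only in how many buffers get examined (and, in a real system, locked) along the way.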

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
