Re: Log levels for checkpoint/bgwriter monitoring

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log levels for checkpoint/bgwriter monitoring
Date: 2007-03-09 05:51:11
Message-ID: Pine.GSO.4.64.0703082321510.14630@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 9 Mar 2007, ITAGAKI Takahiro wrote:

> In my understanding, each backend pins two or so buffers at once. So
> percentage of pinned buffers should be low.

With the pgbench workload, a substantial percentage of the buffer cache
ends up pinned. From staring at the buffer cache using
contrib/pg_buffercache, I believe most of that consists of the index
blocks for the records being updated in the accounts table.

I just posted a new version of the patch I asked for feedback on at the
beginning of this thread, the latest one is at
http://westnet.com/~gsmith/content/postgresql/new-patch-checkpoint.txt
I've been adjusting it to monitor the same data I think you need to refine
your patch. I believe the approach you're taking includes some
assumptions that seem perfectly reasonable, but that my testing doesn't
agree with. There's nothing like measuring something to settle what's
really going on, though, so that's what I've been focusing on. I'd love
to get some feedback on whether other people can replicate the kind of
things I'm seeing.

The new code generates statistics about exactly what the background writer
scan found during each round. If there's been substantial write activity,
it prints a log line when it recycles back to the beginning of the all
scan, to help characterize what the buffer pool looked like during that
scan from the perspective of the bgwriter. Here's some sample log output
from my underpowered laptop while running pgbench:

bgwriter scan all writes=16.6 MB (69.3%) pinned=11.7 MB (48.8%) LRU=7.7 MB (31.9%)
...
checkpoint required (wrote checkpoint_segments)
checkpoint buffers dirty=19.4 MB (80.8%) write=188.9 ms sync=4918.1 ms

Here 69% of the buffer cache contained dirty data, and 49% of the cache
was both pinned and dirty. During that same time period, the LRU write
also wrote out a fair amount of data, operating on the 20% of the cache
that was dirty but not pinned. On my production server, where the
background writer is turned way up to reduce checkpoint times, these
numbers are even more extreme; almost everything that's dirty is also
pinned during pgbench, and the LRU is lucky to find anything it can write
as a result.

That patch is against the 8.2 codebase; now that I'm almost done I'm
planning to move it to HEAD instead soon (where it will conflict
considerably with your patch). If you have an 8.2 configuration you can
test with my patch applied, set log_min_messages = debug2, try it out, and
see what you get when running pgbench for a while. I think you'll
discover some interesting and unexpected things.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2007-03-09 05:51:41 Re: who gets paid for this
Previous Message Tom Lane 2007-03-09 05:42:41 Re: RFC: changing autovacuum_naptime semantics