Skip site navigation (1) Skip section navigation (2)

Re: PATCH: regular logging of checkpoint progress

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-26 07:35:05
Message-ID: 4E574CA9.8090809@2ndQuadrant.com (view raw or flat)
Thread:
Lists: pgsql-hackers
On 08/25/2011 04:57 PM, Tomas Vondra wrote:
> (b) sends bgwriter stats (so that the buffers_checkpoint is updated)
>    

The idea behind only updating the stats in one chunk, at the end, is 
that it makes one specific thing easier to do.  Let's say you're running 
a monitoring system that is grabbing snapshots of pg_stat_bgwriter 
periodically.  If you want to figure out how much work a checkpoint did, 
you only need two points of data to compute that right now.  Whenever 
you see either of the checkpoint count numbers increase, you just 
subtract off the previous sample; now you've got a delta for how many 
buffers that checkpoint wrote out.  You can derive the information about 
the buffer counts involved that appears in the logs quite easily this 
way.  The intent was to make that possible to do, so that people can 
figure this out without needing to parse the log data.

Spreading out the updates defeats that idea.  It also makes it possible 
to see the buffer writes more in real-time, as they happen.  You can 
make a case for both approaches having their use cases; the above is 
just summarizing the logic behind why it's done the way it is right 
now.  I don't think many people are actually doing things with this to 
the level where their tool will care.  The most popular consumer of 
pg_stat_bgwriter data I see is Munin graphing changes, and I don't think 
it will care either way.

Giving people the option of doing it the other way is a reasonable idea, 
but I'm not sure there's enough use case there to justify adding a GUC 
just for that.  My next goal here is to eliminate checkpoint_segments, 
not to add yet another tunable extremely few users would ever touch.

As for throwing more log data out, I'm not sure what new analysis you're 
thinking of that it allows.  I/O gets increasingly spiky as you zoom in 
on it; averaging over a shorter period can easily end up providing less 
insight about trends.  If anything, I spend more time summarizing the 
data that's already there, rather than wanting to break them down.  It's 
already providing way too much detail for most people.  Customers tell 
me they don't care to see checkpoint stats unless they're across a day 
or more of sampling, so even the current "once every ~5 minutes" is way 
more info than they want.  I have all this log parsing code and things 
that look at pg_stat_bgwriter to collect that data and produce higher 
level reports.  And lots of it would break if any of this patch is added 
and people turn it on.  I imagine other log/stat parsing programs might 
suffer issues too.  That's your other hurdle for change here:  the new 
analysis techniques have to be useful enough to justify that some 
downstream tool disruption is inevitable.

If you have an idea for how to use this extra data for something useful, 
let's talk about what that is and see if it's possible to build it in 
instead.  This problem is harder than it looks, mainly because the way 
the OS caches writes here makes trying to derive hard numbers from what 
the background writer is doing impossible.  When the database writes 
things out, and when they actually get written to disk, they are not the 
same event.  The actual write is often during the sync phase, and not 
being able to tracking that beast is where I see the most problems at.  
The write phase, the easier part to instrument in the database, that is 
pretty boring.  That's why the last extra logging I added here focused 
on adding visibility to the sync activity instead.

-- 
Greg Smith   2ndQuadrant US    greg(at)2ndQuadrant(dot)com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


In response to

Responses

pgsql-hackers by date

Next:From: Magnus HaganderDate: 2011-08-26 07:54:19
Subject: Re: PATCH: regular logging of checkpoint progress
Previous:From: Christian UllrichDate: 2011-08-26 06:47:43
Subject: Re: Removal of useless include references

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group