Re: PATCH: regular logging of checkpoint progress

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Noah Misch <noah(at)2ndQuadrant(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-08-27 07:39:14
Message-ID: 4E589F22.1020408@2ndQuadrant.com
Lists: pgsql-hackers
On 08/27/2011 12:01 AM, Noah Misch wrote:
> On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
>    
>> 1. collect pg_stat_bgwriter stats
>> 2. run pgbench for 10 minutes
>> 3. collect pg_stat_bgwriter stats (to compute difference with (1))
>> 4. kill the postmaster
>>
>> The problem is that when checkpoint stats are collected, there might be a
>> checkpoint in progress and in that case the stats are incomplete. In some
>> cases (especially with very small db blocks) this has significant impact
>> because the checkpoints are less frequent.
>>      
> Could you remove this hazard by adding a step "2a. psql -c CHECKPOINT"?
>    

That's what I do in pgbench-tools, and it helps a lot.  It makes it 
easier to identify when the checkpoint kicks in if you know it's 
approximately the same time after each test run begins, given similar 
testing parameters.  That said, it's hard to eliminate all of the edge 
conditions here.
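
As a rough illustration of the stats collection with the extra CHECKPOINT 
step added, the before/after difference could be computed along these 
lines.  This is only a sketch: the column names are ones that 
pg_stat_bgwriter exposes, but the helper name and the sample counter 
values are made up for illustration and don't come from any real run.

```python
# Counters from pg_stat_bgwriter that only ever increase, so a simple
# subtraction gives the activity during the test run.
COUNTERS = (
    "checkpoints_timed",
    "checkpoints_req",
    "buffers_checkpoint",
    "buffers_clean",
    "buffers_backend",
)


def bgwriter_delta(before, after):
    """Return the per-counter difference between two stats snapshots.

    `before` is taken at step 1; `after` is taken at step 3, once the
    forced CHECKPOINT of step 2a has finished.
    """
    return {name: after[name] - before[name] for name in COUNTERS}


# Hypothetical snapshots, invented purely to show the arithmetic.
before = {"checkpoints_timed": 10, "checkpoints_req": 3,
          "buffers_checkpoint": 50000, "buffers_clean": 1200,
          "buffers_backend": 8000}
after = {"checkpoints_timed": 10, "checkpoints_req": 6,
         "buffers_checkpoint": 74000, "buffers_clean": 1500,
         "buffers_backend": 9100}

delta = bgwriter_delta(before, after)
print(delta)
```

Because the manual CHECKPOINT is itself counted in checkpoints_req and 
buffers_checkpoint, it shows up in every run's delta in the same way.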

For example, imagine that you're consuming WAL files such that you hit 
checkpoint_segments every 4 minutes.  In a 10 minute test run, a 
checkpoint will start at 4:00 and finish at around 6:00 (with 
checkpoint_completion_target=0.5).  The next will start at 8:00 and 
should finish at around 10:00--right as the test ends.  
Given the variation that sync timing and rounding issues in the write 
phase add to things, you can expect that some test runs will include 
stats from 2 checkpoints, while others will end the test just before the 
second one finishes.  It does throw the numbers off a bit.
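
The timing in that example can be sketched as follows.  It's a simplified 
model, not how the server schedules anything: it assumes a checkpoint 
starts exactly every 240 seconds, that the write phase takes 
checkpoint_completion_target times that interval, and it folds sync time 
and rounding into a single made-up `jitter_seconds` value.  The function 
names are mine.

```python
def checkpoint_windows(test_seconds, interval_seconds, completion_target):
    """List (start, end) times of checkpoints that start during the test."""
    windows = []
    start = interval_seconds
    while start < test_seconds:
        # The write phase is spread over a fraction of the interval.
        end = start + completion_target * interval_seconds
        windows.append((start, end))
        start += interval_seconds
    return windows


def completed_before(windows, test_seconds, jitter_seconds=0.0):
    """Count checkpoints that finish (including jitter) by the end of the test."""
    return sum(1 for _, end in windows if end + jitter_seconds <= test_seconds)


# 10 minute test, checkpoint every 4 minutes, completion target 0.5
windows = checkpoint_windows(600, 240, 0.5)
print(windows)                               # starts at 240s and 480s
print(completed_before(windows, 600))        # no jitter
print(completed_before(windows, 600, 5.0))   # 5 seconds of extra sync time
```

With no jitter both checkpoints land inside the run; with even a few 
seconds added, the second one falls outside it, which is the n versus 
n-1 situation described above.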

To avoid this when it pops up, I normally aim to push up to where there 
are >=4 checkpoints per test run, just so whether I get n or n-1 of them 
doesn't impact results as much.  But that normally takes doubling the 
length of the test to 20 minutes.  As it will often take me days of test 
time to plow through exploring just a couple of parameters, I'm 
sympathetic to Tomas trying to improve accuracy here without having to 
run for quite so long.  Few people have this problem to worry about, 
though; it's a common issue with benchmarking but not in many other 
contexts.
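
To put a rough number on why more checkpoints per run helps: if you 
assume every checkpoint writes about the same amount (a simplification, 
and the function name here is just for illustration), capturing n-1 
instead of n of them shifts the checkpoint totals by 1/n.

```python
def swing_from_one_checkpoint(n):
    """Fraction of checkpoint writes missed when a run sees n-1 instead of n."""
    return 1 / n


for n in (2, 4):
    print(f"{n} checkpoints per run: up to {swing_from_one_checkpoint(n):.0%} swing")
```

So going from 2 to 4 checkpoints per run halves the worst-case swing 
under that assumption, at the cost of the longer test.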

-- 
Greg Smith   2ndQuadrant US    greg(at)2ndQuadrant(dot)com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us