Re: Log levels for checkpoint/bgwriter monitoring

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Log levels for checkpoint/bgwriter monitoring
Date: 2007-03-27 15:57:23
Message-ID: 200703271557.l2RFvNl28519@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Is this patch ready?

---------------------------------------------------------------------------

Greg Smith wrote:
> I have a WIP patch that adds the main detail I have found I need to
> properly tune checkpoint and background writer activity. I think it's
> almost ready to submit (you can see the current patch against 8.2 at
> http://www.westnet.com/~gsmith/content/postgresql/patch-checkpoint.txt )
> after making it a bit more human-readable. But I've realized that along
> with that, I need some guidance in regards to what log level is
> appropriate for this information.
>
> An example works better than explaining what the patch does:
>
> 2007-02-19 21:53:24.602 EST - DEBUG: checkpoint required (wrote
> checkpoint_segments)
> 2007-02-19 21:53:24.685 EST - DEBUG: checkpoint starting
> 2007-02-19 21:53:24.705 EST - DEBUG: checkpoint flushing buffer pool
> 2007-02-19 21:53:24.985 EST - DEBUG: checkpoint database fsync starting
> 2007-02-19 21:53:42.725 EST - DEBUG: checkpoint database fsync complete
> 2007-02-19 21:53:42.726 EST - DEBUG: checkpoint buffer flush dirty=8034
> write=279956 us sync=17739974 us
>
> Remember that "Load distributed checkpoint" discussion back in December?
> You can see exactly how bad the problem is on your system with this log
> style (this is from a pgbench run where it's postively awful--it really
> does take over 17 seconds for the fsync to execute, and there are clients
> that are hung the whole time waiting for it).
>
> I also instrumented the background writer. You get messages like this:
>
> 2007-02-19 21:58:54.328 EST - DEBUG: BGWriter Scan All - Written = 5/5
> Unscanned = 23/54
>
> This shows that we wrote (5) the maximum pages we were allowed to write
> (5) while failing to scan almost half (23) of the buffers we meant to look
> at (54). By taking a look at this output while the system is under load,
> I found I was able to do bgwriter optimization that used to take me days
> of frustrating testing in hours. I've been waiting for a good guide to
> bgwriter tuning since 8.1 came out. Once you have this, combined with
> knowing how many buffers were dirty at checkpoint time because the
> bgwriter didn't get to them in time (the number you want to minimize), the
> tuning guide practically writes itself.
>
> So my question is...what log level should all this go at? Right now, I
> have the background writer stuff adjusting its level dynamically based on
> what happened; it logs at DEBUG2 if it hits the write limit (which should
> be a rare event once you're tuned properly), DEBUG3 for writes that
> scanned everything they were supposed to, and DEBUG4 if it scanned but
> didn't find anything to write. The source of checkpoint information logs
> at DEBUG1, the fsync/write info at DEBUG2.
>
> I'd like to move some of these up. On my system, I even have many of the
> checkpoint messages logged at INFO (the source of the checkpoint and the
> total write time line). It's a bit chatty, but when you get some weird
> system pause issue it makes it easy to figure out if checkpoints were to
> blame. Given how useful I feel some of these messages are to system
> tuning, and to explaining what currently appears as inexplicable pauses, I
> don't want them to be buried at DEBUG levels where people are unlikely to
> ever see them (I think some people may be concerned about turning on
> things labeled DEBUG at all). I am aware that I am too deep into this to
> have an unbiased opinion at this point though, which is why I ask for
> feedback on how to proceed here.
>
> --
> * Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2007-03-27 16:00:09 Re: Copyright question
Previous Message David Fetter 2007-03-27 15:52:35 Re: sorted results on pgbuildfarm