Re: Publish checkpoint timing and sync files summary data to pg_stat_bgwriter

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Publish checkpoint timing and sync files summary data to pg_stat_bgwriter
Date: 2012-01-20 04:54:35
Message-ID: 4F18F38B.7080108@2ndQuadrant.com
Lists: pgsql-hackers

On 01/19/2012 10:52 AM, Robert Haas wrote:
> It's not quite clear from your email, but I gather that the way that
> this is intended to work is that these values increment every time we
> checkpoint?

Right--they get updated in the same atomic bump that moves up things
like buffers_checkpoint.

> Also, forgive me for asking this possibly-stupid question, but of what
> use is this information? I can't imagine why I'd care about a running
> total of the number of files fsync'd to disk. I also can't really
> imagine why I'd care about the length of the write phase, which surely
> will almost always be a function of checkpoint_completion_target and
> checkpoint_timeout unless I manage to overrun the number of
> checkpoint_segments I've allocated. The only number that really seems
> useful to me is the time spent syncing. I have a clear idea what to
> look for there: smaller numbers are better than bigger ones. For the
> rest I'm mystified.

Priority #1 here is to reduce (but, admittedly, not always eliminate)
the need for log file parsing in this particular area. That means
publishing all the major bits of the existing log message that can be
exposed this way, and the write phase time is one of them. You mentioned
one reason why the write phase time might be interesting; there could be
others. One of the things expected here is that Munin will expand its
graphing of values from pg_stat_bgwriter to include all these fields.
Most of the time the graph of time spent in the write phase will be
boring and useless. Making the rare times when it isn't easy to spot at
a glance on a graph is one motivation for including it.

As for why to include the number of files being sync'd, one reason is
again simply wanting to include everything that can easily be
published. A second is that it helps support ideas like my "Checkpoint
sync pause" one; that's untunable in any reasonable way without some
easy way of monitoring the number of files typically sync'd. Sometimes
when I'm investigating checkpoint spikes during sync, I wonder whether
they happened because more files than usual were synced, or just
because of more churn on a smaller number of files. Making this easy to
graph pulls that data out to where I can compare it with disk I/O
trends. And there's precedent now proving that an always incrementing
number in pg_stat_bgwriter can be turned into such a graph easily by
monitoring tools.
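
As a rough illustration of that last point, here is a minimal sketch of
how a monitoring tool might turn an always-incrementing counter into a
graphable rate. Nothing here is taken from the patch or from Munin: the
function name, the sample values, and the "a decrease means the
statistics were reset" rule are all assumptions made for the example.

```python
from typing import List, Tuple


def counter_rates(samples: List[Tuple[float, int]]) -> List[Tuple[float, float]]:
    """Turn (unix_time, cumulative_value) samples into (unix_time, per-second rate)."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        delta = v1 - v0
        if delta < 0:
            # Assumption: the counter went backwards because the statistics
            # were reset, so the new value is the whole increase since then.
            delta = v1
        rates.append((t1, delta / elapsed))
    return rates


# Hypothetical samples of a cumulative "files synced" counter, taken every 300 s.
samples = [(0.0, 1000), (300.0, 1600), (600.0, 1900), (900.0, 150)]
rates = counter_rates(samples)
```

Each point in rates is the average number of files synced per second
over the preceding sampling interval, which is the sort of series that
can be lined up against disk I/O graphs.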

> And, it doesn't seem like it's necessarily going to save me a whole
> lot either, because if it turns out that my sync phases are long, the
> first question out of my mouth is going to be "what percentage of my
> total sync time is accounted for by the longest sync?". And so right
> there I'm back to the logs. It's not clear how such information could
> be usefully exposed in pg_stat_bgwriter either, since you probably
> want to know only the last few values, not a total over all time.

This isn't ideal yet. I mentioned how some future "performance event
logging history collector" was really needed as a place to push longest
sync times into, and we don't have it yet. This is the best thing to
instrument that I'm sure is useful and that I can build on top of the
existing infrastructure.

The idea is that this change makes it possible to trigger a "sync times
are too long" alert out of a tool that's based solely on database
queries. When that goes off, yes, you may be back to the logs again
for more details about the longest individual sync time. But the rest
of the time, which is hopefully the normal state of things, you can
ignore the logs and just track the pg_stat_bgwriter numbers.
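
A minimal sketch of what such an alert check could look like, working
from two snapshots of the cumulative counters. The checkpoints_timed and
checkpoints_req keys match existing pg_stat_bgwriter columns; the
sync_time_ms key, the function names, and the threshold are hypothetical
placeholders, since the final column names and units are not settled in
this message.

```python
from typing import Dict, Optional


def avg_sync_ms_per_checkpoint(prev: Dict[str, float],
                               cur: Dict[str, float]) -> Optional[float]:
    """Average sync time per checkpoint between two cumulative snapshots."""
    checkpoints = ((cur["checkpoints_timed"] + cur["checkpoints_req"])
                   - (prev["checkpoints_timed"] + prev["checkpoints_req"]))
    sync_ms = cur["sync_time_ms"] - prev["sync_time_ms"]  # hypothetical field
    if checkpoints <= 0 or sync_ms < 0:
        # No checkpoint finished in the interval, or the counters were reset.
        return None
    return sync_ms / checkpoints


def sync_alert(prev: Dict[str, float], cur: Dict[str, float],
               threshold_ms: float) -> bool:
    """True when the average sync time per checkpoint exceeds the threshold."""
    avg = avg_sync_ms_per_checkpoint(prev, cur)
    return avg is not None and avg > threshold_ms


# Hypothetical snapshots taken a few checkpoint intervals apart.
prev = {"checkpoints_timed": 10, "checkpoints_req": 2, "sync_time_ms": 4000}
cur = {"checkpoints_timed": 12, "checkpoints_req": 2, "sync_time_ms": 34000}
alert = sync_alert(prev, cur, threshold_ms=10000)
```

Because this only sees totals, it can say that sync time per checkpoint
is high on average, but not which individual sync was the slow one; that
detail still has to come from the logs.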

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
