Re: Instrument checkpoint sync calls

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Instrument checkpoint sync calls
Date: 2010-11-30 16:38:13
Message-ID: 4CF52875.9040806@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jeff Janes wrote:
> For the individual file sync times emitted under debug1, it would be
> very handy if the file being synced was identified, for example
> "relation base/16384/16523". Rather than being numbered sequentially
> within a given checkpoint.
>

I was numbering them sequentially so that it's straightforward to graph
the sync times in an external analysis tool, but the relation data is
helpful too. New patch reflecting all upthread suggestions is
attached. The output looks like this now at DEBUG1:

LOG: checkpoint starting: xlog
DEBUG: checkpoint sync: number=1 file=base/16424/11645
time=11589.549000 msec
DEBUG: checkpoint sync: number=2 file=base/16424/16438 time=16.148000 msec
DEBUG: checkpoint sync: number=3 file=base/16424/16437 time=53.530000 msec
DEBUG: checkpoint sync: number=4 file=base/16424/16447 time=10.214000 msec
DEBUG: checkpoint sync: number=5 file=base/16424/11607 time=1.499000 msec
DEBUG: checkpoint sync: number=6 file=base/16424/16425_fsm
time=2.921000 msec
DEBUG: checkpoint sync: number=7 file=base/16424/16437.1 time=4.237000 msec
DEBUG: checkpoint sync: number=8 file=base/16424/16428_fsm
time=1.654000 msec
DEBUG: checkpoint sync: number=9 file=base/16424/16442 time=7.920000 msec
DEBUG: checkpoint sync: number=10 file=base/16424/16428_vm
time=2.613000 msec
DEBUG: checkpoint sync: number=11 file=base/16424/11618 time=1.468000 msec
DEBUG: checkpoint sync: number=12 file=base/16424/16437_fsm
time=2.638000 msec
DEBUG: checkpoint sync: number=13 file=base/16424/16428 time=2.883000 msec
DEBUG: checkpoint sync: number=14 file=base/16424/16425 time=3.369000 msec
DEBUG: checkpoint sync: number=15 file=base/16424/16437_vm
time=8.686000 msec
DEBUG: checkpoint sync: number=16 file=base/16424/16425_vm
time=5.984000 msec
LOG: checkpoint complete: wrote 2074 buffers (50.6%); 0 transaction log
file(s) added, 0 removed, 3 recycled; write=0.617 s, sync=11.715 s,
total=22.167 s; sync files=16, longest=11.589 s, average=0.724 s

I kept the units for the DEBUG level ones in msec because that's a
better scale for the common really short syncs during good behavior.
But the summary info in seconds now appears at the end of the existing
"checkpoint complete" message, so only one line to parse for those
looking to analyze the gross checkpoint data. That looks to work well
enough for finding situations like the big ext3 spikes. You can easily
see one in this example by the fact that "longest=11.589 s" is almost
the entirety of "sync=11.715 s". That's the really key thing there's
currently no visibility into, that's made obvious with this patch.

This might be ready for some proper review now. I know there's at least
one blatant bug still in here I haven't found yet, related to how the
averages are computed. I saw this once:

LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log
file(s) added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s,
total=0.001 s; sync files=0, longest=0.000 s,
average=-9223372036854775808.-2147483 s

After an immediate checkpoint, so at least one path not quite right yet.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD

Attachment Content-Type Size
log-sync-v3.patch text/x-patch 5.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-11-30 16:40:10 Re: crash-safe visibility map, take three
Previous Message Tom Lane 2010-11-30 16:37:13 Re: crash-safe visibility map, take three