Re: Checkpoint sync pause

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checkpoint sync pause
Date: 2012-01-16 16:00:15
Message-ID: CA+TgmoaxbamSfWyA4s3-G1QOMMB4tP7vKhXRAmrAa2Ofms9Qvw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 16, 2012 at 2:57 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> ...
> 2012-01-16 02:39:01.184 EST [25052]: DEBUG:  checkpoint sync: number=34
> file=base/16385/11766 time=0.006 msec
> 2012-01-16 02:39:01.184 EST [25052]: DEBUG:  checkpoint sync delay: seconds
> left=3
> 2012-01-16 02:39:01.284 EST [25052]: DEBUG:  checkpoint sync delay: seconds
> left=2
> 2012-01-16 02:39:01.385 EST [25052]: DEBUG:  checkpoint sync delay: seconds
> left=1
> 2012-01-16 02:39:01.860 EST [25052]: DEBUG:  checkpoint sync: number=35
> file=global/12007 time=375.710 msec
> 2012-01-16 02:39:01.860 EST [25052]: DEBUG:  checkpoint sync delay: seconds
> left=3
> 2012-01-16 02:39:01.961 EST [25052]: DEBUG:  checkpoint sync delay: seconds
> left=2
> 2012-01-16 02:39:02.061 EST [25052]: DEBUG:  checkpoint sync delay: seconds
> left=1
> 2012-01-16 02:39:02.161 EST [25052]: DEBUG:  checkpoint sync: number=36
> file=base/16385/11754 time=0.008 msec
> 2012-01-16 02:39:02.555 EST [25052]: LOG:  checkpoint complete: wrote 2586
> buffers (63.1%); 1 transaction log file(s) added, 0 removed, 0 recycled;
> write=2.422 s, sync=13.282 s, total=16.123 s; sync files=36, longest=1.085
> s, average=0.040 s
>
> No docs yet, really need a better guide to tuning checkpoints as they exist
> now before there's a place to attach a discussion of this to.

Yeah, I think this is an area where a really good documentation patch
might help more users than any code we could write. On the technical
end, I dislike this a little bit because the parameter is clearly
something some people are going to want to set, but it's not at all
clear what value they should set it to and it has complex interactions
with the other checkpoint settings - and the user's hardware
configuration. If there's no way to make it more self-tuning, then
perhaps we should just live with that, but it would be nice to come up
with something more user-transparent. Also, I am still struggling
with what the right benchmarking methodology even is to judge whether
any patch in this area "works". Can you provide more details about
your test setup?

Just one random thought: I wonder if it would make sense to cap the
delay after each sync at the time spent performing that sync. That
would make the tuning of the delay less sensitive to the total number
of files, because we won't unnecessarily wait after each sync when
they're not actually taking any time to complete. It's probably
easier to estimate the number of segments that are likely to contain
lots of dirty data than to estimate the total number of segments that
you might have touched at least once since the last checkpoint, and
there's no particular reason to think the latter is really what you
should be tuning on anyway.
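
As a rough illustration of the capping idea above, here is a minimal
sketch. It is not code from the patch: the function names are
hypothetical, and the 0.3 s configured delay is an assumed value chosen
only for the example. The three sync durations are the ones shown in the
quoted log (0.006 msec, 375.710 msec, 0.008 msec).

```python
from typing import Iterable


def capped_sync_delay(sync_seconds: float, max_delay_seconds: float) -> float:
    """Return the pause to take after one file sync.

    The pause never exceeds the configured delay, and never exceeds the
    time the sync itself took, so near-instant syncs add almost no wait.
    (Hypothetical helper, not part of PostgreSQL.)
    """
    if sync_seconds < 0 or max_delay_seconds < 0:
        raise ValueError("durations must be non-negative")
    return min(sync_seconds, max_delay_seconds)


def total_pause(sync_times: Iterable[float],
                max_delay_seconds: float,
                capped: bool = True) -> float:
    """Sum the pauses over a checkpoint's syncs, with or without the cap."""
    if capped:
        return sum(capped_sync_delay(t, max_delay_seconds) for t in sync_times)
    # Uncapped behaviour: always wait the full configured delay per file.
    return sum(max_delay_seconds for _ in sync_times)


# Sync durations taken from the quoted log, converted to seconds.
sync_times = [0.000006, 0.375710, 0.000008]
configured_delay = 0.3  # assumed value, for illustration only

with_cap = total_pause(sync_times, configured_delay, capped=True)
without_cap = total_pause(sync_times, configured_delay, capped=False)
```

Under these assumptions, only the one slow sync contributes a meaningful
pause when the cap is applied, whereas the uncapped total grows with the
number of files regardless of how long each sync took.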

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
