Re: PATCH: regular logging of checkpoint progress

From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Greg Smith" <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-05 23:52:52
Message-ID: e6328725ee5af0897c640eb3529a1a18.squirrel@sq.gransy.com
Lists: pgsql-hackers

On 3 September 2011, 8:19, Greg Smith wrote:
> If you're expanding log_checkpoints to an enum, for that to handle what
> I think everybody might ever want (for what checkpoints do now at
> least), I'd find that more useful if it happened like this instead:
>
> log_checkpoints = {off, on, write, sync, verbose}
>
> I don't think you should change the semantics of off/on, which will
> avoid breaking existing postgresql.conf files and resources that suggest
> tuning advice. "write" can toggle on what you're adding; "sync" should
> control whether the DEBUG1 messages showing the individual file names in
> the sync phase appear; and "verbose" can include both.

Thanks, those are definitely good ideas. They extend the original patch
and make it much more useful.
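
To make the proposed values concrete, here is a rough sketch of how they
might map to what gets logged. It is plain Python for illustration, not
PostgreSQL code, and the helper names are hypothetical. It also assumes
that every value other than "off" still prints the existing checkpoint
summary, which the proposal above does not state explicitly.

```python
from enum import Enum


class LogCheckpoints(Enum):
    """Proposed values for log_checkpoints (off/on keep today's meaning)."""
    OFF = "off"
    ON = "on"
    WRITE = "write"
    SYNC = "sync"
    VERBOSE = "verbose"


def logs_summary(level):
    # Assumption: any setting except "off" keeps the current summary output.
    return level is not LogCheckpoints.OFF


def logs_write_progress(level):
    # "write" turns on the new progress messages; "verbose" includes them.
    return level in (LogCheckpoints.WRITE, LogCheckpoints.VERBOSE)


def logs_sync_files(level):
    # "sync" turns on the per-file sync messages; "verbose" includes them.
    return level in (LogCheckpoints.SYNC, LogCheckpoints.VERBOSE)
```

With this mapping an existing "log_checkpoints = on" behaves exactly as
before, because neither of the two new message types is enabled for it.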

> As far as a heuristic for making this less chatty when there's nothing
> exciting happening goes, I think something based on how much time has
> passed would be the best one. In your use case, I would guess you don't
> really care whether a message appears every n%. If I understand you
> correctly now, you would mainly care about getting enough log detail to
> know 1) when things are running really slow, or 2) often enough that the
> margin of error in your benchmark results from unaccounted checkpoint
> writes is acceptable. In both of those cases, I'd think a time-based
> threshold would be appropriate, and that also deals with the time-based
> checkpoints, too.

Yes, the time-based threshold seems like the right solution.

> If your logging criteria for the write phase was "display a message any
> time more than 30 seconds have passed since last seeing one", that would
> give you only a few lines of output in a boring, normal
> checkpoint--certainly less than the 9 in-progress samples you're
> outputting now, at 10% intervals. But in the pathological situations
> where writes are super slow, your log data would become correspondingly
> denser, which is exactly what you want in that situation.

I'm still not sure what a reasonable value would be or how to determine
it. What happens when checkpoint_timeout is increased, or shared_buffers is
larger, etc.? What about using (checkpoint_timeout/10) for the time-based
checkpoints and 30s for the other checkpoints?
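
A minimal sketch of that idea, in plain Python rather than PostgreSQL
code. The function and parameter names are hypothetical, all times are in
seconds, and "at least 30 seconds" is used as the comparison.

```python
def progress_log_interval(timed_checkpoint, checkpoint_timeout_s,
                          fixed_interval_s=30.0):
    """Pick the minimum gap between two progress messages."""
    if timed_checkpoint:
        # Time-based checkpoint: scale with the configured timeout.
        return checkpoint_timeout_s / 10.0
    # Any other checkpoint: fixed 30s threshold.
    return fixed_interval_s


def should_log(now_s, last_log_s, interval_s):
    """True once at least interval_s has passed since the last message."""
    return now_s - last_log_s >= interval_s
```

With checkpoint_timeout at 5 minutes (300s) both branches give 30s, so
the two rules only differ once the timeout is changed. At 1 hour, for
example, a time-based checkpoint would log every 360s at most.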

> I think combining the two makes the most sense: "log when >=30 seconds
> have passed since the last message, and there's been >=10% more progress
> made". (Maybe do the progress check before the time one, to cut down on

Is this a good idea? The case where the timeout expires and not much data
has been written is interesting, and this would not log it. But OTOH this
would nicely solve the issue with time-based checkpoints and a fixed
threshold.
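
To illustrate the concern, a small sketch of the combined rule in plain
Python (not PostgreSQL code). The names are hypothetical, progress is
expressed as a percentage of buffers written, and the progress check runs
first, following the order suggested in the quoted text.

```python
def should_log_combined(now_s, last_log_s, done_pct, last_logged_pct,
                        min_interval_s=30.0, min_progress_pct=10.0):
    """Log only if BOTH enough progress and enough time have accumulated."""
    # Progress check first, as suggested in the quoted proposal.
    if done_pct - last_logged_pct < min_progress_pct:
        return False
    return now_s - last_log_s >= min_interval_s


# A very slow checkpoint: 60s elapsed, only 3% written -> stays silent.
slow_case = should_log_combined(60, 0, 3, 0)
# A normal checkpoint: 60s elapsed, 15% written -> logs.
normal_case = should_log_combined(60, 0, 15, 0)
```

Here slow_case is False, which is exactly the situation that looks worth
logging, while normal_case is True.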

Tomas
