Re: PATCH: regular logging of checkpoint progress

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: regular logging of checkpoint progress
Date: 2011-09-19 14:55:23
Message-ID: 4E7757DB.3000601@2ndQuadrant.com
Lists: pgsql-hackers

On 09/05/2011 07:52 PM, Tomas Vondra wrote:
>> If your logging criterion for the write phase was "display a message any
>> time more than 30 seconds have passed since last seeing one", that would
>> give you only a few lines of output in a boring, normal
>> checkpoint--certainly less than the 9 in-progress samples you're
>> outputting now, at 10% intervals. But in the pathological situations
>> where writes are super slow, your log data would become correspondingly
>> denser, which is exactly what you want in that situation.
>>
> I still am not sure what should be a reasonable value or how to determine
> it. What happens when the checkpoint_timeout is increased, there's more
> shared_buffers etc.? What about using (checkpoint_timeout/10) for the
> time-based checkpoints and 30s for the other checkpoints?
>

That may work fine. Maybe implement it like that, and see if the amount
of logging detail is reasonable in a couple of test scenarios.
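
A rough sketch of the interval rule proposed in the quoted text, to make the
two cases concrete. The function name and the time_based flag are
hypothetical and are not taken from the patch:

```python
def progress_log_interval(checkpoint_timeout_s: float, time_based: bool) -> float:
    """Hypothetical helper: minimum seconds between progress messages.

    Assumes the rule from the thread: checkpoint_timeout / 10 for
    time-based checkpoints, a fixed 30 seconds for all others.
    """
    if time_based:
        return checkpoint_timeout_s / 10.0
    return 30.0
```

With a checkpoint_timeout of 300 seconds, both cases would come out at 30
seconds; raising the timeout stretches only the time-based interval.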

>> I think combining the two makes the most sense: "log when >= 30 seconds
>> have passed since the last message, and there's been >= 10% more progress
>> made". (Maybe do the progress check before the time one, to cut down on
>>
> Is this a good idea? The case when the timeout expires and not much
> data was written is interesting, and this would not log it. But OTOH this
> would nicely solve the issue with time-based checkpoints and a fixed
> threshold.
>

One thing I am trying to avoid here is needing to check the system clock
after every buffer write. I also consider it useful to put an upper
bound on how many of these messages will appear even in the verbose
mode. This deals with both those problems.

Yes, there is a potential problem with this idea. Let's say checkpoint
writes degrade to where they take an hour. In that case, you won't see
the first progress report until 6 minutes (10%) have gone by with this
implementation. I don't see a good way to resolve that without
violating one of the other priorities I listed above though. You'll
have to poll the system clock constantly and will end up creating a lot
of log entries if you don't do a check against the % progress first.
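
To illustrate the trade-off described above, here is a minimal simulation
sketch, not the patch itself. The class, its method names, and the injected
clock are all hypothetical. It also assumes one particular reading of the
rule: the progress mark advances every time another 10% is written, whether
or not a message is emitted, so the clock is read at most about ten times
per checkpoint.

```python
class ProgressReporter:
    """Hypothetical sketch: gate clock reads behind a cheap progress check."""

    def __init__(self, total_buffers, clock, min_interval_s=30.0, min_progress_pct=10):
        self.total = total_buffers
        self.clock = clock                  # injected so the sketch can be simulated
        self.min_interval_s = min_interval_s
        self.min_progress_pct = min_progress_pct
        self.last_mark = 0                  # buffers written at the last progress mark
        self.last_message_time = clock()
        self.clock_reads = 1

    def buffer_written(self, written):
        """Return True if a progress message should be logged now."""
        # Progress check first, in integer arithmetic: no clock access here.
        if (written - self.last_mark) * 100 < self.min_progress_pct * self.total:
            return False
        self.last_mark = written            # assumption: advance even if we stay silent
        now = self.clock()
        self.clock_reads += 1
        if now - self.last_message_time < self.min_interval_s:
            return False
        self.last_message_time = now
        return True


def simulate(total_buffers, seconds_per_buffer):
    """Write buffers at a fixed rate; return (message times, clock reads)."""
    now = [0.0]
    reporter = ProgressReporter(total_buffers, clock=lambda: now[0])
    message_times = []
    for written in range(1, total_buffers + 1):
        now[0] = written * seconds_per_buffer
        if reporter.buffer_written(written):
            message_times.append(now[0])
    return message_times, reporter.clock_reads
```

Calling simulate(3600, 1.0) models the hour-long checkpoint: the first
message appears at 360 seconds, which is the 6 minute delay mentioned
above. Calling simulate(1000, 0.01) models a fast checkpoint: no messages
are produced and the clock is still read only a handful of times.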

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
