Re: Redesigning checkpoint_segments

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: hlinnaka <hlinnaka(at)iki(dot)fi>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Venkata Balaji N <nag1010(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Redesigning checkpoint_segments
Date: 2015-06-26 12:40:35
Message-ID: CA+TgmoY5MBJDM6n1=6TNV=36gEcbPbp_GXqM5SDt5DMtogrd-g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 26, 2015 at 7:08 AM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> I'm not sure what to do about this. With the attached patch, you get the
> same leisurely pacing with restartpoints as you get with checkpoints, but
> you exceed max_wal_size during recovery, by the amount determined by
> checkpoint_completion_target. Alternatively, we could try to perform
> restartpoints faster than checkpoints, but then you'll get nasty checkpoint
> I/O storms in recovery.
>
> A bigger change would be to write a WAL record at the beginning of a
> checkpoint. It wouldn't do anything else, but it would be a hint to recovery
> that there's going to be a checkpoint record later whose redo-pointer will
> point to that record. We could then start the restartpoint at that record
> already, before seeing the checkpoint record itself.
>
> I think the attached is better than nothing, but I'll take a look at that
> beginning-of-checkpoint idea. It might be too big a change to do at this
> point, but I'd really like to fix this properly for 9.5, since we've changed
> the way checkpoints are scheduled anyway.

I agree. Actually, I've seen a number of presentations indicating
that the pacing of checkpoints is already too aggressive near the
beginning, because as soon as we initiate the checkpoint we have a
storm of full page writes. I'm sure we can come up with arbitrarily
complicated systems to compensate for this, but something simple might
be to calculate progress as (done+adjust)/(total+adjust) rather than
done/total. If you let adjust=total/9, for example, then you
essentially start the progress meter at 10% instead of 0%. Even
something that simple might be an improvement.
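
To make the arithmetic concrete, here is a minimal sketch of that adjusted
progress calculation. It is only an illustration of the formula, not
PostgreSQL code: the function name adjusted_progress and the parameter
adjust_fraction are hypothetical, and done/total stand for whatever units
the checkpoint progress is measured in.

```python
def adjusted_progress(done: float, total: float,
                      adjust_fraction: float = 1 / 9) -> float:
    """Return (done + adjust) / (total + adjust).

    adjust is taken as total * adjust_fraction; the default of 1/9 is the
    example value from the text above, not a tuned setting.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    adjust = total * adjust_fraction
    return (done + adjust) / (total + adjust)


if __name__ == "__main__":
    # Compare the plain and adjusted progress meters at a few points.
    total = 100
    for done in (0, 25, 50, 75, 100):
        plain = done / total
        adjusted = adjusted_progress(done, total)
        print(f"done={done:3d}  plain={plain:.3f}  adjusted={adjusted:.3f}")
```

With adjust=total/9 the meter reads 10% before any work is done, 55% at the
halfway point, and still reaches 100% at completion, so the head start
shrinks steadily as the checkpoint proceeds.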

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
