Re: Load Distributed Checkpoints, take 3

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Load Distributed Checkpoints, take 3
Date: 2007-06-21 14:47:31
Message-ID: 16289.1182437251@sss.pgh.pa.us
Lists: pgsql-patches

Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
> I don't think you understand how the settings work. Did you read the
> documentation? If you did, it's apparently not adequate.

I did read the documentation, and I'm not complaining that I don't
understand it. I'm complaining that I don't like the presented API
because it's self-inconsistent. You've got two parameters that are in
effect upper and lower bounds for the checkpoint write rate, but they
are named inconsistently and not even measured in the same kind of unit.
Nor do I agree that the inconsistency buys any ease of use.

> The main tuning knob is checkpoint_smoothing, which is defined as a
> fraction of the checkpoint interval (both checkpoint_timeout and
> checkpoint_segments are taken into account). Normally, the write phase
> of a checkpoint takes exactly that much time.

So the question is, why in the heck would anyone want the behavior that
"checkpoints take exactly X time"? The useful part of this whole patch
is to cap the write rate at something that doesn't interfere too much
with foreground queries. I don't see why people wouldn't prefer
"checkpoints can take any amount of time up to the checkpoint interval,
but we do our best not to exceed Y writes/second".

Basically I don't see what useful values checkpoint_smoothing would have
other than 0 and 1. You might as well make it a bool.

> There's another possible strategy: keep the I/O rate constant, but vary
> the length of the checkpoint. checkpoint_rate allows you to do that.

But only from the lower side: it sets a floor on the write rate, not a
ceiling.

> Now how would you replace checkpoint_smoothing with a max I/O rate?

I don't see why you think that's hard. It looks to me like the
components of the decision are the same numbers in any case: you have to
estimate your progress towards checkpoint completion, your available
time till next checkpoint, and your write rate. Then you either delay
or not.
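
A minimal sketch of that decision, written in Python for brevity rather
than as backend code. Every name here (should_delay, max_write_rate, and
the rest) is hypothetical and not taken from the patch. It assumes
progress is counted in pages written, and that the cap is expressed in
pages per second.

```python
def should_delay(pages_written, pages_total, elapsed,
                 time_to_next_checkpoint, max_write_rate):
    """Decide whether the checkpoint writer should sleep before its next write.

    pages_written / pages_total  -- progress towards checkpoint completion
    elapsed                      -- seconds since this checkpoint started
    time_to_next_checkpoint      -- seconds until the next checkpoint is due
    max_write_rate               -- assumed cap, in pages per second
    """
    remaining = pages_total - pages_written
    if remaining <= 0:
        return False  # nothing left to write

    if time_to_next_checkpoint <= 0:
        return False  # already out of time: write flat out

    # Rate needed to finish before the next checkpoint is due.
    required_rate = remaining / time_to_next_checkpoint
    if required_rate >= max_write_rate:
        return False  # honoring the cap would overrun the interval

    if elapsed <= 0:
        # No time has passed yet; any write so far is "too fast".
        return pages_written > 0

    # Otherwise delay only while we are running faster than the cap.
    current_rate = pages_written / elapsed
    return current_rate > max_write_rate
```

With a cap of 50 pages/second, a writer that has done 100 pages in its
first second and has 100 seconds left would delay; the same writer with
only 10 seconds left would not, since it needs 90 pages/second to finish
in time.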

regards, tom lane
