Re: RC2 and open issues

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: RC2 and open issues
Date: 2004-12-27 22:56:33
Message-ID: 200412272256.iBRMuXf14798@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches

Greg Stark wrote:
>
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > Suppose that you run a checkpoint every 5 minutes, and with the knob
> > you slow down the checkpoint to extend over say 3 minutes on average,
> > rather than the normal blast-it-out-as-fast-as-possible. Then you'll
> > be keeping an average of 8 minutes worth of WAL files instead of 5.
> > Not exactly a killer objection.
>
> Right. I was thinking that the goal would be to spread the checkpoint out over
> exactly the checkpoint interval, minus some safety factor. So if it has some
> estimate of the total number of dirty buffers that need flushing it could just
> divide the checkpoint interval by that and calculate the delay needed to
> finish in some fraction of the checkpoint interval, 60% seems like a
> reasonable guess.
>
> > One issue is that while we can regulate the rate at which we issue
> > write()s, we still have to issue fsync()s at the end, and we can't
> > control what happens in response to those. It's quite possible that
> > all the I/O would happen in response to the fsync()s anyway, in which
> > case the whole exercise would be a waste of time.
>
> Well you could fsync earlier as well, say just before whenever you sleep.
> Obviously the delay on the checkpoint process doesn't matter to performance if
> it's about to sleep. It could end up scheduling i/o earlier than necessary and
> cause redundant seeks but then I guess that's an inherent tension between
> trying to spread out the i/o evenly and trying to get the ideal ordering of
> i/o.

It certainly is an interesting idea to have the checkpoint span a longer
time period. We couldn't do that with sync, but now that we fsync each
file it is possible.
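
To put numbers on the pacing idea quoted above, here is a rough sketch in Python. The function names, the parameters, and the use of 60% as a default are assumptions made for illustration only; nothing like this exists in the backend today.

```python
def per_buffer_delay(checkpoint_interval_s, dirty_buffers, target_fraction=0.6):
    """Seconds to pause after each buffer write so that all writes are
    spread over target_fraction of the checkpoint interval.

    Hypothetical helper: assumes the dirty-buffer estimate is known up
    front and ignores the time the writes themselves take.
    """
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    if dirty_buffers <= 0:
        return 0.0  # nothing to flush, so nothing to pace
    return checkpoint_interval_s * target_fraction / dirty_buffers


def wal_retention_minutes(checkpoint_interval_min, checkpoint_duration_min):
    """Average minutes of WAL kept when the checkpoint itself is stretched
    out: one full interval plus the time the checkpoint takes to finish."""
    return checkpoint_interval_min + checkpoint_duration_min
```

With a 300 second interval and an estimated 18000 dirty buffers, per_buffer_delay(300, 18000) works out to roughly 10 ms between writes, and wal_retention_minutes(5, 3) gives the 8 minutes Tom mentions.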

It would be easy to do this if we didn't also need the fsync. The original
idea was that we would write() the dirty buffers long before the
checkpoint, and the kernel would write many of these dirty buffers
before we got to checkpoint time.

We could go with the checkpoint clock sweep idea, but then we aren't
just writing the buffers; we are actually doing write/fsync a lot more
often. I can't think of a way this would be a win.
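
For what it's worth, the difference between the two approaches can be sketched as below. This is a toy in Python against a single file, not backend code: paced_checkpoint, its parameters, and the 8192-byte pages in the demo are all made up for illustration.

```python
import os
import tempfile
import time


def paced_checkpoint(fd, pages, delay_s, fsync_each=False):
    """Write each page to fd, pausing delay_s between writes.

    fsync_each=False: one fsync at the very end, so the kernel may still
    do most of the real I/O in response to that final fsync.
    fsync_each=True: fsync before every pause, which spreads the I/O out
    but issues many more fsyncs.
    Returns the number of fsync calls made.
    """
    fsyncs = 0
    for i, page in enumerate(pages):
        # Sketch only: a real implementation would handle short writes.
        os.write(fd, page)
        if fsync_each:
            os.fsync(fd)
            fsyncs += 1
        if i < len(pages) - 1:
            time.sleep(delay_s)
    if not fsync_each:
        os.fsync(fd)
        fsyncs += 1
    return fsyncs


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        pages = [bytes(8192)] * 4
        print(paced_checkpoint(fd, pages, 0.01))
        print(paced_checkpoint(fd, pages, 0.01, fsync_each=True))
    finally:
        os.close(fd)
        os.remove(path)
```

The second mode is the one that worries me: the number of fsyncs grows with the number of pauses, and each one forces the kernel's hand rather than letting it order the writes itself.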

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
