Re: Load distributed checkpoint

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Load distributed checkpoint
Date: 2006-12-28 17:50:19
Message-ID: 200612281750.kBSHoJO14313@momjian.us
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro wrote:
>
> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> > > 566.973777
> > > 327.158222 <- (1) write()
> > > 560.773868 <- (2) sleep
> > > 544.106645 <- (3) fsync()
> >
> > OK, so you are saying that performance dropped only during the write(),
> > and not during the fsync()? Interesting.
>
> Almost yes, but there is a small drop in fsync. (560->540)
>
>
> > I would like to know the
> > results of a few tests just like you reported them above:
> >
> > 1a) write spread out over 30 seconds
> > 1b) write with no delay
> >
> > 2a) sleep(0)
> > 2b) sleep(30)
> >
> > 3) fsync
> >
> > I would like to know the performance at each stage for each combination,
> > e.g. when using 1b, 2a, 3, performance during the write() phase was X,
> > during the sleep it was Y, and during the fsync it was Z. (Of course,
> > sleep(0) has no stage timing.)
>
> I'm thinking about generalizing your idea: adding three parameters
> (checkpoint_write, checkpoint_naptime and checkpoint_fsync)
> to control sleeps in each stage.
>
> 1) write() spread out over 'checkpoint_write' seconds.
> 2) sleep 'checkpoint_naptime' seconds between write() and fsync().
> 3) fsync() spread out over 'checkpoint_fsync' seconds.
>
> If all three parameters are zero, checkpoints behave the same as now.
> If checkpoint_write = checkpoint_timeout and the other two are zero,
> it is just like my earlier proposal.
>
>
> As you might expect, I intend the above only for development purposes.
> Three additional parameters are hard for users to work with. If we can
> derive some proper values from the tests, we should set those values as
> defaults. I assume we can derive them from the existing checkpoint_timeout.

Great idea, though I wouldn't bother with checkpoint_fsync. I think
Simon's previous email spelled out the problems of trying to delay
fsync() calls: in most cases, there will be one file with most of the
I/O, and that file's fsync is going to be the flood. Basically, I think
the variability of table access is too great for the fsync delay to ever
be tunable by users.

To summarize, if we could have fsync() write only the buffers that were
dirtied as part of the checkpoint, we could spread the write() phase over
the entire time between checkpoints, but we can't do that, so we have to
make the delay user-tunable.
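
To make the three-stage proposal quoted above concrete, here is a minimal
sketch in Python. It is not PostgreSQL code. The names plan_checkpoint,
run_checkpoint and CheckpointPlan are hypothetical, and dividing each
budget evenly across the buffers or files is an assumption made only for
illustration. The write, fsync and sleep callables are passed in so the
pacing can be examined without doing real I/O.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class CheckpointPlan:
    write_delay: float  # seconds to pause after each write()
    naptime: float      # seconds to pause between the write and fsync stages
    fsync_delay: float  # seconds to pause after each fsync()


def plan_checkpoint(n_buffers: int, n_files: int,
                    checkpoint_write: float = 0.0,
                    checkpoint_naptime: float = 0.0,
                    checkpoint_fsync: float = 0.0) -> CheckpointPlan:
    """Turn the three proposed parameters into per-step delays.

    Assumption: each stage's time budget is split evenly across its steps.
    With all three parameters at zero, every delay is zero.
    """
    write_delay = checkpoint_write / n_buffers if n_buffers > 0 else 0.0
    fsync_delay = checkpoint_fsync / n_files if n_files > 0 else 0.0
    return CheckpointPlan(write_delay, checkpoint_naptime, fsync_delay)


def run_checkpoint(plan: CheckpointPlan,
                   buffers: Sequence[str],
                   files: Sequence[str],
                   write: Callable[[str], None],
                   fsync: Callable[[str], None],
                   sleep: Callable[[float], None]) -> None:
    """Run the three stages in order: write, nap, fsync."""
    # Stage 1: write each dirty buffer, pausing after every write.
    for buf in buffers:
        write(buf)
        if plan.write_delay > 0:
            sleep(plan.write_delay)
    # Stage 2: optional nap between the last write and the first fsync.
    if plan.naptime > 0:
        sleep(plan.naptime)
    # Stage 3: fsync each file, pausing after every fsync.
    for f in files:
        fsync(f)
        if plan.fsync_delay > 0:
            sleep(plan.fsync_delay)
```

Calling plan_checkpoint with only the buffer and file counts yields zero
delays, which corresponds to the "same as now" case. Leaving
checkpoint_fsync at zero while setting the other two corresponds to
dropping the fsync delay as suggested above.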

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
