From:
Robert Haas <robertmhaas(at)gmail(dot)com>
To:
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:
Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:
Re: Spread checkpoint sync
Date:
2011-02-01 17:44:05
Message-ID:
AANLkTi=CbZMgxg=SX=H=Vpe=SCTT03OopQEhvJbfGEhd@mail.gmail.com
Lists:
pgsql-hackers
On Mon, Jan 31, 2011 at 4:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Back to the idea at hand - I proposed something a bit along these
>> lines upthread, but my idea was to proactively perform the fsyncs on
>> the relations that had gone the longest without a write, rather than
>> the ones with the most dirty data.
>
> Yeah. What I meant to suggest, but evidently didn't explain well, was
> to use that or something much like it as the rule for deciding *what* to
> fsync next, but to use amount-of-unsynced-data-versus-threshold as the
> method for deciding *when* to do the next fsync.

Oh, I see. Yeah, that could be a good algorithm.
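To make the split between *what* and *when* concrete, here is a minimal sketch in Python rather than in backend C. Everything in it is hypothetical and for illustration only: the `FsyncScheduler` class, the `threshold_bytes` knob, and the `do_fsync` callback (a stand-in for a real fsync call) are assumptions, not anything that exists in PostgreSQL. The rule it models is the one quoted above: act only when the total unsynced data exceeds a threshold, and then pick the dirty relation that has gone longest without a write.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RelationState:
    unsynced_bytes: int = 0   # bytes written since this relation's last fsync
    last_write: float = 0.0   # timestamp of the most recent write


@dataclass
class FsyncScheduler:
    threshold_bytes: int                  # hypothetical tunable, not a real GUC
    do_fsync: Callable[[str], None]       # stand-in for a real fsync call
    relations: Dict[str, RelationState] = field(default_factory=dict)

    def total_unsynced(self) -> int:
        return sum(r.unsynced_bytes for r in self.relations.values())

    def record_write(self, rel: str, nbytes: int, now: float) -> List[str]:
        """Account for a write and return the relations fsynced as a result."""
        state = self.relations.setdefault(rel, RelationState())
        state.unsynced_bytes += nbytes
        state.last_write = now

        synced: List[str] = []
        # WHEN: do nothing until unsynced data exceeds the threshold.
        while self.total_unsynced() > self.threshold_bytes:
            dirty = [n for n, r in self.relations.items() if r.unsynced_bytes > 0]
            if not dirty:
                break
            # WHAT: the dirty relation that has gone longest without a write.
            victim = min(dirty, key=lambda n: self.relations[n].last_write)
            self.do_fsync(victim)
            self.relations[victim].unsynced_bytes = 0
            synced.append(victim)
        return synced
```

Note that the sketch says nothing about what the threshold ought to be; it only shows that the two decisions can be made independently.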

I also think Bruce's idea of calling fsync() on each relation just
*before* we start writing the pages from that relation might have some
merit. (I'm assuming here that we are sorting the writes.) That
should tend to result in the end-of-checkpoint fsyncs being quite
fast, because we'll only have as much dirty data floating around as we
actually wrote during the checkpoint, which according to Greg Smith is
usually a small fraction of the total data in need of flushing. Also,
if one of the pre-write fsyncs takes a long time, then that'll get
factored into our calculations of how fast we need to write the
remaining data to finish the checkpoint on schedule. Of course
there's still the possibility that the I/O system literally can't
finish a checkpoint in X minutes, but even in that case, the I/O
saturation will hopefully be more spread out across the entire
checkpoint instead of falling like a hammer at the very end.
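A rough sketch of the pre-write fsync idea, again in Python and again under stated assumptions: the `checkpoint` function and its `write_page`, `fsync_rel`, `clock`, and `sleep` callbacks are hypothetical stand-ins, and the even-spacing pacing rule (time left divided by pages left) is a simplification chosen for illustration, not the backend's actual checkpoint scheduling. It shows sorted writes, an fsync just before each relation's pages are written, and pacing that is recomputed from the clock, so a slow pre-write fsync shrinks the budget for the remaining pages.

```python
from itertools import groupby
from typing import Callable, Iterable, List, Tuple

Page = Tuple[str, int]  # (relation name, block number)


def checkpoint(
    dirty_pages: Iterable[Page],
    write_page: Callable[[str, int], None],
    fsync_rel: Callable[[str], None],
    clock: Callable[[], float],
    sleep: Callable[[float], None],
    deadline: float,
) -> List[float]:
    """Write all dirty pages before `deadline`; return the per-page delays used."""
    pages = sorted(dirty_pages)          # sorted writes: by relation, then block
    remaining = len(pages)
    delays: List[float] = []
    written_rels: List[str] = []

    for rel, group in groupby(pages, key=lambda p: p[0]):
        # Pre-write fsync: flush what the OS already holds for this relation
        # before this checkpoint adds its own writes.
        fsync_rel(rel)
        for _, block in group:
            # Pacing is recomputed from the clock, so time spent in a slow
            # pre-write fsync reduces the delay for every page still to go.
            time_left = max(deadline - clock(), 0.0)
            delay = time_left / remaining
            sleep(delay)
            write_page(rel, block)
            delays.append(delay)
            remaining -= 1
        written_rels.append(rel)

    # End-of-checkpoint fsyncs: only this checkpoint's own writes should
    # still be pending for each relation at this point.
    for rel in written_rels:
        fsync_rel(rel)
    return delays
```

With a simulated clock, a 4-second pre-write fsync on the second of two relations cuts the per-page delay from 2.5 seconds to 0.5 seconds for that relation's pages, which is the "factored into our calculations" effect described above.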

Back to your idea: One problem with trying to bound the unflushed data
is that it's not clear what the bound should be. I've had this mental
model where we want the OS to write out pages to disk, but that's not
always true, per Greg Smith's recent posts about Linux kernel tuning
slowing down VACUUM. A possible advantage of the Momjian algorithm
(as it's known in the literature) is that we don't actually start
forcing anything out to disk until we have a reason to do so - namely,
an impending checkpoint.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company