From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Spread checkpoint sync
Date: 2011-02-01 18:32:28
Message-ID: AANLkTimgvaV-sFOd6Ces4YXK_ZzEuJ=p3GkyEKubVjPH@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 1, 2011 at 12:58 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> I also think Bruce's idea of calling fsync() on each relation just
>> *before* we start writing the pages from that relation might have
>> some merit.
>
> What bothers me about that is that you may have a lot of the same
> dirty pages in the OS cache as the PostgreSQL cache, and you've just
> ensured that the OS will write those *twice*. I'm pretty sure that
> the reason the aggressive background writer settings we use have not
> caused any noticeable increase in OS disk writes is that many
> PostgreSQL writes of the same buffer keep an OS buffer page from
> becoming stale enough to get flushed until PostgreSQL writes to it
> taper off. Calling fsync() right before doing "one last push" of
> the data could be really pessimal for some workloads.

I was thinking about what Greg reported here:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01387.php

If the amount of pre-checkpoint dirty data is 3GB and the checkpoint
is writing 250MB, then you shouldn't have all that many extra
writes... but you might have some, and that might be enough to send
the whole thing down the tubes.

InnoDB apparently handles this problem by advancing the redo pointer
in small steps instead of in large jumps. AIUI, in addition to
tracking the LSN of each page, they also track the first-dirtied LSN.
That lets you checkpoint to an arbitrary LSN by flushing just the
pages with an older first-dirtied LSN. So instead of doing a
checkpoint every hour, you might do a mini-checkpoint every 10
minutes. Since the mini-checkpoints each need to flush less data,
they should be less disruptive than a full checkpoint. But that, too,
will generate some extra writes. Basically, any idea that involves
calling fsync() more often is going to tend to smooth out the I/O load
at the cost of some increase in the total number of writes.
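
To make the first-dirtied-LSN idea concrete, here is a toy sketch in Python. It is only my reading of the scheme described above, not InnoDB's or PostgreSQL's actual code; the names (Page, BufferPool, checkpoint_to, redo_pointer) are all made up for illustration, and "flushing" a page is just a counter increment.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    page_lsn: int = 0                        # LSN of the latest change to the page
    first_dirtied_lsn: Optional[int] = None  # LSN that first dirtied it since its last flush


class BufferPool:
    """Toy buffer pool; every name here is hypothetical."""

    def __init__(self) -> None:
        self.pages: Dict[int, Page] = {}
        self.writes = 0  # number of page flushes issued so far

    def modify(self, page_id: int, lsn: int) -> None:
        page = self.pages.setdefault(page_id, Page())
        if page.first_dirtied_lsn is None:
            page.first_dirtied_lsn = lsn  # remember when the page became dirty
        page.page_lsn = lsn

    def checkpoint_to(self, target_lsn: int) -> List[int]:
        """Flush only pages first dirtied before target_lsn; return their ids."""
        flushed = []
        for page_id, page in sorted(self.pages.items()):
            if page.first_dirtied_lsn is not None and page.first_dirtied_lsn < target_lsn:
                page.first_dirtied_lsn = None  # page is clean again
                self.writes += 1
                flushed.append(page_id)
        return flushed

    def redo_pointer(self, current_lsn: int) -> int:
        """Oldest LSN still needed for recovery: the minimum first-dirtied LSN."""
        dirty = [p.first_dirtied_lsn for p in self.pages.values()
                 if p.first_dirtied_lsn is not None]
        return min(dirty, default=current_lsn)


pool = BufferPool()
pool.modify(1, lsn=10)
pool.modify(2, lsn=20)
pool.modify(1, lsn=30)  # page 1 changes again; its first-dirtied LSN stays 10
pool.modify(3, lsn=40)

flushed = pool.checkpoint_to(25)        # mini-checkpoint: pages 1 and 2 only
redo_after_mini = pool.redo_pointer(current_lsn=40)

pool.modify(1, lsn=50)                  # page 1 gets dirtied again
rest = pool.checkpoint_to(60)           # flushes pages 1 and 3
```

In this run the two mini-checkpoints issue four page writes, where a single checkpoint at LSN 60 would have issued three, because page 1 is written twice. That is the extra-write cost mentioned above, in miniature.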

If we don't want any increase at all in the number of writes,
spreading out the fsync() calls is pretty much the only other option.
I'm worried that even with good tuning that won't be enough to tamp
down the latency spikes. But maybe it will be...
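
For illustration only, spreading the fsync() calls could look roughly like the Python sketch below. This is not the patch under discussion; spread_fsync and budget_seconds are hypothetical names, and the even spacing is an assumption made to keep the example short. The number of writes is unchanged; only the timing of the syncs moves.

```python
import os
import tempfile
import time
from typing import List


def spread_fsync(paths: List[str], budget_seconds: float) -> List[float]:
    """fsync each file in turn, pausing between calls so that they are
    spaced evenly across roughly budget_seconds.

    Returns the offset (seconds from start) at which each fsync was issued.
    """
    if not paths:
        return []
    pause = budget_seconds / len(paths)
    start = time.monotonic()
    offsets = []
    for i, path in enumerate(paths):
        fd = os.open(path, os.O_RDWR)
        try:
            offsets.append(time.monotonic() - start)
            os.fsync(fd)  # force this file's dirty data to stable storage
        finally:
            os.close(fd)
        if i < len(paths) - 1:
            time.sleep(pause)  # give the I/O system time to recover
    return offsets


# Small demonstration on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    demo_paths = []
    for n in range(3):
        path = os.path.join(tmp, f"segment_{n}")
        with open(path, "wb") as f:
            f.write(b"x" * 8192)
        demo_paths.append(path)
    offsets = spread_fsync(demo_paths, budget_seconds=0.3)
```

A real implementation would presumably have to adapt the pause to how long each fsync() actually takes, which is where the tuning worry comes in.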

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company