Re: [HACKERS] Sync vs. fsync during checkpoint

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zeugswetter Andreas SB SD <ZeugswetterA(at)spardat(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Win32 port list <pgsql-hackers-win32(at)postgresql(dot)org>
Subject: Re: [HACKERS] Sync vs. fsync during checkpoint
Date: 2004-02-09 14:33:09
Message-ID: 40279A25.6020600@Yahoo.com
Lists: pgsql-hackers pgsql-hackers-win32

Bruce Momjian wrote:

> Jan Wieck wrote:
>> Tom Lane wrote:
>>
>> > "Zeugswetter Andreas SB SD" <ZeugswetterA(at)spardat(dot)at> writes:
>> >> So Imho the target should be to have not much IO open for the checkpoint,
>> >> so the fsync is fast enough, even if serial.
>> >
>> > The best we can do is push out dirty pages with write() via the bgwriter
>> > and hope that the kernel will see fit to write them before checkpoint
>> > time arrives. I am not sure if that hope has basis in fact or if it's
>> > just wishful thinking. Most likely, if it does have basis in fact it's
>> > because there is a standard syncer daemon forcing a sync() every thirty
>> > seconds.
>>
>> Looking at the response time charts I did for showing how vacuum delay
>> is doing, it seems at least on Linux there is hope that that is the
>> case. Those charts have just a regular 5 minute checkpoint with enough
>> checkpoint segments for that, and no other sync effort done at all.
>>
>> The system has a hard time to handle a larger scaled test DB, so it is
>> definitely well saturated with IO. The charts are here:
>>
>> http://developer.postgresql.org/~wieck/vacuum_cost/
>>
>> >
>> > That means that instead of an I/O storm every checkpoint interval,
>> > we get a smaller I/O storm every 30 seconds. Not sure this is a big
>> > improvement. Jan already found out that issuing very frequent sync()s
>> > isn't a win.
>>
>> In none of those charts I can see any checkpoint caused IO storm any
>> more. Charts I'm currently doing for 7.4.1 show extremely clear spikes
>> at checkpoints. If someone is interested in those as well I will put
>> them up.
>
> So, Jan, are you basically saying that the background writer has solved
> the checkpoint I/O flood problem, and we just need to deal with changing
> sync to multiple fsync's at checkpoint?

ISTM that the background writer at least has the ability to lower the
impact of a checkpoint significantly enough that one might not care
about it any more. "Has the ability" means it needs to be adjusted to
the actual DB usage. The charts I produced were not done with the
default settings, but rather after making the bgwriter a bit more
aggressive against dirty pages.
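
To illustrate why the bgwriter needs adjusting to the workload, here is
a toy model. It is not PostgreSQL code: the function and all parameter
names are made up, and it ignores pages being dirtied again after they
were written. It only counts how many dirty pages are left over for the
checkpoint itself, depending on how many pages the background writer
pushes out per round.

```python
def simulate(ticks, dirtied_per_tick, bgwriter_pages_per_tick, checkpoint_every):
    """Return the number of dirty pages each checkpoint must flush itself.

    All parameters are hypothetical knobs of this toy model, not
    PostgreSQL settings.
    """
    dirty = 0
    checkpoint_flushes = []
    for tick in range(1, ticks + 1):
        # Backends dirty some pages during this tick.
        dirty += dirtied_per_tick
        # The background writer trickles out a bounded number of pages.
        dirty -= min(dirty, bgwriter_pages_per_tick)
        # At checkpoint time, whatever is still dirty is flushed at once.
        if tick % checkpoint_every == 0:
            checkpoint_flushes.append(dirty)
            dirty = 0
    return checkpoint_flushes


# No background writing: every checkpoint flushes the full backlog.
print(simulate(600, 10, 0, 300))   # [3000, 3000]
# A writer that nearly keeps up leaves only a small remainder.
print(simulate(600, 10, 8, 300))   # [600, 600]
```

In this model a writer that is too lazy for the actual dirtying rate
leaves the checkpoint with almost the whole backlog, which is the
"needs to be adjusted" part.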

The whole sync() vs. fsync() discussion is in my opinion nonsense at
this point. Without the ability to limit the number of files to
something reasonable, by employing tablespaces in the form of larger
container files, the risk of forcing excessive head movement is simply
too high.
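
For reference, the two strategies under discussion look roughly like
this. It is a minimal sketch in Python rather than backend C, and the
function names and the list of paths are assumptions for illustration;
only os.sync() and os.fsync() are real calls. The point is that the
fsync variant issues one forced flush per file, so its cost grows with
the number of files touched since the last checkpoint.

```python
import os


def flush_with_sync():
    """Ask the kernel to schedule a flush of every dirty buffer on the system."""
    os.sync()  # Unix only; covers unrelated files as well


def flush_with_fsync(paths):
    """fsync each touched file in turn and return how many files were synced.

    `paths` is a hypothetical list of files written since the last
    checkpoint; duplicates are synced only once.
    """
    synced = 0
    for path in sorted(set(paths)):
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)  # blocks until this one file is on stable storage
        finally:
            os.close(fd)
        synced += 1
    return synced
```

With a few large container files the loop runs a handful of times. With
one file per table and index it can run thousands of times, each call
forcing the disk to wherever that file happens to live.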

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
