Re: Experimental patch for inter-page delay in VACUUM

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Postgresql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Experimental patch for inter-page delay in VACUUM
Date: 2003-11-04 15:51:02
Lists: pgsql-hackers

Tom Lane wrote:

>Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>>What still needs to be addressed is the IO storm caused by checkpoints. I 
>>see it much relaxed when stretching out the BufferSync() over most of 
>>the time until the next one should occur. But the kernel sync at its 
>>end still pushes the system hard against the wall.
>
>I have never been happy with the fact that we use sync(2) at all.  Quite
>aside from the "I/O storm" issue, sync() is really an unsafe way to do a
>checkpoint, because there is no way to be certain when it is done.  And
>on top of that, it does too much, because it forces syncing of files
>unrelated to Postgres.
>
>I would like to see us go over to fsync, or some other technique that
>gives more certainty about when the write has occurred.  There might be
>some scope that way to allow stretching out the I/O, too.
>
>The main problem with this is knowing which files need to be fsync'd.
>The only idea I have come up with is to move all buffer write operations
>into a background writer process, which could easily keep track of
>every file it's written into since the last checkpoint.  This could cause
>problems though if a backend wants to acquire a free buffer and there's
>none to be had --- do we want it to wait for the background process to
>do something?  We could possibly say that backends may write dirty
>buffers for themselves, but only if they fsync them immediately.  As
>long as this path is seldom taken, the extra fsyncs shouldn't be a big
>performance problem.
>
>Actually, once you build it this way, you could make all writes
>synchronous (open the files O_SYNC) so that there is never any need for
>explicit fsync at checkpoint time.  The background writer process would
>be the one incurring the wait in most cases, and that's just fine.  In
>this way you could directly control the rate at which writes are issued,
>and there's no I/O storm at all.  (fsync could still cause an I/O storm
>if there's lots of pending writes in a single file.)

Or maybe fdatasync() would be slightly more efficient - do we care about 
flushing metadata that much?
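
To make the bookkeeping concrete, here is a rough sketch in Python (not PostgreSQL code) of a writer that remembers which files it has touched since the last checkpoint and then flushes only those, using fdatasync() where the platform has it. WriteTracker, write_block, checkpoint and demo are names made up for illustration. It also assumes the kernel flushes a file's dirty pages no matter which descriptor wrote them, which is how Linux behaves but should be checked on other platforms.

```python
import os
import tempfile

# os.fdatasync is missing on some platforms (e.g. macOS); fall back to fsync.
_flush_data = getattr(os, "fdatasync", os.fsync)


class WriteTracker:
    """Hypothetical stand-in for the background writer's bookkeeping."""

    def __init__(self):
        self.dirty = set()  # paths written since the last checkpoint

    def write_block(self, path, offset, data):
        # Ordinary buffered write: the kernel may hold it in cache for a while.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            os.pwrite(fd, data, offset)
        finally:
            os.close(fd)
        self.dirty.add(path)

    def checkpoint(self):
        # Flush only the files we wrote, one at a time, instead of sync().
        flushed = []
        for path in sorted(self.dirty):
            fd = os.open(path, os.O_RDWR)
            try:
                _flush_data(fd)  # blocks until the kernel reports the flush done
            finally:
                os.close(fd)
            flushed.append(path)
        self.dirty.clear()
        return flushed


def demo():
    # Write three blocks into two files, then checkpoint.
    tmpdir = tempfile.mkdtemp()
    a = os.path.join(tmpdir, "rel_a")
    b = os.path.join(tmpdir, "rel_b")
    tracker = WriteTracker()
    tracker.write_block(a, 0, b"x" * 8192)
    tracker.write_block(b, 0, b"y" * 8192)
    tracker.write_block(a, 8192, b"z" * 8192)
    return tracker, tracker.checkpoint(), [a, b]
```

Because each file is flushed separately, the loop in checkpoint() is also the natural place to sleep between files if the I/O needs spreading out. On the metadata question: fdatasync() may skip things like timestamps, but it still has to flush any metadata needed to read the data back, such as a changed file size.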


