Re: Experimental patch for inter-page delay in VACUUM

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Postgresql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Experimental patch for inter-page delay in VACUUM
Date: 2003-11-04 15:51:02
Message-ID: 3FA7CAE6.1040402@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:

>Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>
>
>>What still needs to be addressed is the I/O storm caused by checkpoints. I
>>see it much relaxed when stretching out the BufferSync() over most of
>>the time until the next one should occur. But the kernel sync at its
>>end still pushes the system hard against the wall.
>>
>>
>
>I have never been happy with the fact that we use sync(2) at all. Quite
>aside from the "I/O storm" issue, sync() is really an unsafe way to do a
>checkpoint, because there is no way to be certain when it is done. And
>on top of that, it does too much, because it forces syncing of files
>unrelated to Postgres.
>
>I would like to see us go over to fsync, or some other technique that
>gives more certainty about when the write has occurred. There might be
>some scope that way to allow stretching out the I/O, too.
>
>The main problem with this is knowing which files need to be fsync'd.
>The only idea I have come up with is to move all buffer write operations
>into a background writer process, which could easily keep track of
>every file it's written into since the last checkpoint. This could cause
>problems though if a backend wants to acquire a free buffer and there's
>none to be had --- do we want it to wait for the background process to
>do something? We could possibly say that backends may write dirty
>buffers for themselves, but only if they fsync them immediately. As
>long as this path is seldom taken, the extra fsyncs shouldn't be a big
>performance problem.
>
>Actually, once you build it this way, you could make all writes
>synchronous (open the files O_SYNC) so that there is never any need for
>explicit fsync at checkpoint time. The background writer process would
>be the one incurring the wait in most cases, and that's just fine. In
>this way you could directly control the rate at which writes are issued,
>and there's no I/O storm at all. (fsync could still cause an I/O storm
>if there are lots of pending writes in a single file.)
>
>
>
Or maybe fdatasync() would be slightly more efficient - do we care about
flushing metadata that much?
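
To make the question concrete, here is a rough sketch of the two calls side by side. It is not code from the tree: write_and_flush() and its data_only flag are names made up for illustration. As far as I understand POSIX, fdatasync() must still flush whatever metadata is needed to read the data back (a changed file size, for example), but it may skip things like the modification time, which is where any saving would come from.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only (not PostgreSQL code).
 * Writes len bytes from buf to path, then forces them to stable storage.
 * If data_only is nonzero it uses fdatasync(), otherwise fsync().
 * Returns 0 on success, -1 on any error.
 */
int
write_and_flush(const char *path, const void *buf, size_t len, int data_only)
{
    const char *p = buf;
    size_t      left = len;
    int         rc;
    int         fd;

    fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* write() may return a short count, so loop until everything is out */
    while (left > 0)
    {
        ssize_t     n = write(fd, p, left);

        if (n < 0)
        {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t) n;
    }

    /* Unlike sync(2), both calls return only after this file is flushed */
    rc = data_only ? fdatasync(fd) : fsync(fd);

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

A caller would use write_and_flush(path, page, len, 1) for the fdatasync() variant and pass 0 for plain fsync(). Whether the difference is measurable depends on the platform and filesystem, so treat this as a starting point for testing rather than a claim about the gain.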

cheers

andrew
