Re: Experimental patch for inter-page delay in VACUUM

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Experimental patch for inter-page delay in VACUUM
Date: 2003-11-10 15:05:04
Message-ID: 3FAFA920.1050207@Yahoo.com
Lists: pgsql-hackers

Bruce Momjian wrote:

> Jan Wieck wrote:
>> Bruce Momjian wrote:
>>
>> > Now, O_SYNC is going to force every write to the disk. If we have a
>> > transaction that has to write lots of buffers (has to write them to
>> > reuse the shared buffer)
>>
>> So make the background writer/checkpointer keeping the LRU head clean. I
>> explained that 3 times now.
>
> If the background cleaner has to not just write() but write/fsync or
> write/O_SYNC, it isn't going to be able to clean them fast enough. It
> creates a bottleneck where we didn't have one before.
>
> We are trying to eliminate an I/O storm during checkpoint, but the
> solutions seem to be making the non-checkpoint times slower.
>

It looks as if you're assuming that I am making the backends unable to
write on their own, so that they have to wait on the checkpointer. I
never said that.

If the checkpointer keeps the LRU heads clean, that takes write load off
the backends. Sure, they will be able to dirty pages faster, but only in
theory: in practice, if you have a reasonably good cache hit rate, they
will mostly find already dirty buffers and just add some more dust to
them.

If, after all that, the checkpointer (doing write()+whateversync) is not
able to keep up with the rate at which buffers get dirtied, the backends
will have to do some write()s themselves again, because they will eat up
the clean buffers at the LRU head and overtake the checkpointer.
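
The fallback described above can be illustrated with a toy simulation.
This is only a sketch under simplifying assumptions, not the actual
patch: the function name, the rates, and the model (one LRU queue, every
reused buffer is dirtied again) are all hypothetical.

```python
from collections import deque


def simulate(rounds, cleaner_rate, backend_rate, nbuffers=8):
    """Count write()s the backends must do themselves (toy model).

    Hypothetical model: the LRU queue holds one flag per buffer
    (True = dirty). Each round the background cleaner writes out up to
    `cleaner_rate` dirty buffers nearest the LRU head, then the backends
    evict `backend_rate` buffers from the head and dirty them again.
    """
    lru = deque([True] * nbuffers)  # worst case: everything starts dirty
    backend_writes = 0
    for _ in range(rounds):
        # Background cleaner: clean dirty buffers starting at the LRU head.
        cleaned = 0
        for i in range(len(lru)):
            if cleaned == cleaner_rate:
                break
            if lru[i]:
                lru[i] = False
                cleaned += 1
        # Backends: take the head buffer; if it is still dirty, the
        # backend has overtaken the cleaner and must write it itself.
        for _ in range(backend_rate):
            if lru.popleft():
                backend_writes += 1
            lru.append(True)  # reused buffer is dirtied, goes to MRU end
    return backend_writes
```

In this model a cleaner that is faster than the backends leaves them
with no writes of their own, while a slower cleaner does not block them;
they simply start writing again.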

Also please notice another little change in behaviour. The old code just
went through the buffer cache sequentially, possibly flushing buffers
that got dirtied after the checkpoint started, which is way ahead of
time (they need to be flushed for the next checkpoint, not now). That
means that if the same buffer gets dirtied again after that, we have
wasted a full disk write on it. My new code creates a list of dirty
blocks at the beginning of the checkpoint, and flushes only those that
are still dirty at the time it gets to them.
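
The difference between the two checkpoint strategies can be sketched as
follows. This is a minimal illustration, not the code from the patch:
the Buffer class, the function names, and the on_step hook (which stands
in for concurrent backend activity) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    block: int
    dirty: bool = False
    writes: int = 0  # disk writes issued for this buffer


def flush(buf):
    """Stand-in for writing the buffer to disk."""
    buf.writes += 1
    buf.dirty = False


def checkpoint_sequential(buffers, on_step=lambda i: None):
    """Old behaviour: walk the whole cache, flush whatever is dirty."""
    flushed = []
    for i, buf in enumerate(buffers):
        on_step(i)  # hypothetical hook: backends may dirty buffers here
        if buf.dirty:
            flush(buf)
            flushed.append(buf.block)
    return flushed


def checkpoint_snapshot(buffers, on_step=lambda i: None):
    """New behaviour: list dirty buffers first, flush those still dirty."""
    todo = [buf for buf in buffers if buf.dirty]  # taken at checkpoint start
    flushed = []
    for i, buf in enumerate(todo):
        on_step(i)
        if buf.dirty:  # skip buffers someone else has written meanwhile
            flush(buf)
            flushed.append(buf.block)
    return flushed
```

If a backend dirties a previously clean buffer while the checkpoint is
running, the sequential walk flushes it when it reaches it, whereas the
snapshot variant leaves it for the next checkpoint.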

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
