Re: Experimental patch for inter-page delay in VACUUM

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)yahoo(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ang Chin Han <angch(at)bytecraft(dot)com(dot)my>, Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Experimental patch for inter-page delay in VACUUM
Date: 2003-11-10 04:31:39
Message-ID: 200311100431.hAA4Vdf29615@candle.pha.pa.us
Lists: pgsql-hackers


I would be interested to know whether the sync() load is diminished if
you have the background writer process continually writing old dirty
buffers to kernel buffers. What this does is push more dirty buffers
into the kernel cache in the hope that the OS will write those buffers
on its own before the checkpoint does its write/sync work. This might
allow us to reduce the sync() load while avoiding the need for
O_SYNC/fsync().

Perhaps sync() is bad partly because the checkpoint runs through all the
dirty shared buffers, writes them all to the kernel, and then issues
sync(), almost guaranteeing a flood of writes to the disk. With this
method, the checkpoint would find fewer dirty buffers in the shared
buffer cache, and sync() would therefore have fewer kernel writes to
flush.
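
To make the argument concrete, here is a toy model (not PostgreSQL code)
that counts how many dirty shared buffers a checkpoint finds, with and
without a background writer trickling the oldest dirty buffers out to
the kernel. The function name, the parameters and the numbers are all
made up for illustration, and the model says nothing about whether the
OS really writes those buffers to disk before sync() arrives.

```python
import random


def simulate(ticks, pool_size, dirty_per_tick, bg_writes_per_tick,
             checkpoint_every, seed=0):
    """Return how many dirty buffers each checkpoint had to write.

    Hypothetical toy model: one tick is an arbitrary unit of time.
    """
    rng = random.Random(seed)
    # Buffer id -> tick when it became dirty. A dict keeps insertion
    # order, so iterating over it visits the oldest dirty buffer first.
    dirty = {}
    checkpoint_writes = []
    for tick in range(1, ticks + 1):
        # Backends dirty some randomly chosen shared buffers.
        for _ in range(dirty_per_tick):
            dirty.setdefault(rng.randrange(pool_size), tick)
        # Background writer hands the oldest dirty buffers to the kernel.
        for buf in list(dirty)[:bg_writes_per_tick]:
            del dirty[buf]
        # Checkpoint writes whatever is still dirty, then would sync().
        if tick % checkpoint_every == 0:
            checkpoint_writes.append(len(dirty))
            dirty.clear()
    return checkpoint_writes


without_bg = simulate(300, 1000, 5, 0, 100)
with_bg = simulate(300, 1000, 5, 4, 100)
print("checkpoint writes without background writer:", without_bg)
print("checkpoint writes with background writer:   ", with_bg)
```

In this model the burst at checkpoint time shrinks because most of the
writes were already handed to the kernel in small pieces. Whether the
real sync() load drops as well depends on the OS flushing its cache in
the meantime, which is exactly the open question.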

---------------------------------------------------------------------------

Jan Wieck wrote:
> Tom Lane wrote:
>
> > Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> >
> >> How I can see the background writer operating is that he's keeping the
> >> buffers in the order of the LRU chain(s) clean, because those are the
> >> buffers that most likely get replaced soon. In my experimental ARC code
> >> it would traverse the T1 and T2 queues from LRU to MRU, write out n1 and
> >> n2 dirty buffers (n1+n2 configurable), then fsync all files that have
> >> been involved in that, nap depending on where he got down the queues (to
> >> increase the write rate when running low on clean buffers), and do it
> >> all over again.
> >
> > You probably need one more knob here: how often to issue the fsyncs.
> > I'm not convinced "once per outer loop" is a sufficient answer.
> > Otherwise this is sounding pretty good.
>
> This is definitely heading in the right direction.
>
> I currently have a crude and ugly hacked system, that does checkpoints
> every minute but stretches them out over the whole time. It writes out
> the dirty buffers in T1+T2 LRU order intermixed, stretches out the flush
> over the whole checkpoint interval and does sync()+usleep() every 32
> blocks (if it has time to do this).
>
> This is clearly the wrong way to implement it, but ...
>
> The same system has ARC and delayed vacuum. With normal, unmodified
> checkpoints every 300 seconds, the transaction response time for
> new_order still peaks at over 30 seconds (5 is already too much) so the
> system basically comes to a freeze during a checkpoint.
>
> Now with this high-frequency sync()ing and checkpointing by the minute,
> the entire system load levels out really nicely. Basically it's constantly
> checkpointing. So maybe the thing we're looking for is to make the
> checkpoint process the background buffer writer process and let it
> checkpoint 'round the clock. Of course, with a bit more selectivity on
> what to fsync and not doing system wide sync() every 10-500 milliseconds :-)
>
>
> Jan
>
> --
> #======================================================================#
> # It's easier to get forgiveness for being wrong than for being right. #
> # Let's break this rule - forgive me. #
> #================================================== JanWieck(at)Yahoo(dot)com #
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>
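
For reference, the stretched-out flush Jan describes above (write in
LRU order, sync()+usleep() every 32 blocks, spread over the checkpoint
interval) could be sketched roughly as follows. This is only one
reading of his description, not his patch: paced_flush, write_block and
sync are hypothetical names, and skipping the nap when behind schedule
is my interpretation of "if it has time to do this".

```python
import time


def paced_flush(blocks, interval_s, write_block, sync, batch=32,
                sleep=time.sleep, clock=time.monotonic):
    """Write blocks in batches, sync after each batch, and nap so the
    whole flush is spread over roughly interval_s seconds."""
    start = clock()
    batches = [blocks[i:i + batch] for i in range(0, len(blocks), batch)]
    for done, chunk in enumerate(batches, start=1):
        for blk in chunk:
            write_block(blk)
        sync()
        # Point in time at which this batch should be finished if the
        # flush is exactly on schedule.
        target = start + interval_s * done / len(batches)
        nap = target - clock()
        if nap > 0:
            # Ahead of schedule: sleep. Behind schedule: keep writing.
            sleep(nap)


if __name__ == "__main__":
    # Dry run: 100 pretend blocks spread over 2 seconds, 4 batches.
    paced_flush(list(range(100)), 2.0,
                write_block=lambda blk: None,
                sync=lambda: print("sync"))
```

The sleep and clock arguments exist only so the pacing can be exercised
without waiting for real time to pass.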

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
