Re: Load Distributed Checkpoints, take 3

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Load Distributed Checkpoints, take 3
Date: 2007-06-25 23:00:44
Message-ID: 7621.1182812444@sss.pgh.pa.us
Lists: pgsql-patches

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> The way transitions between completely idle and all-out bursts happen were
> one problematic area I struggled with. Since the LRU point doesn't move
> during the idle parts, and the lingering buffers have a usage_count>0, the
> LRU scan won't touch them; the only way to clear out a bunch of dirty
> buffers leftover from the last burst is with the all-scan.

One thing that might be worth changing is that right now, BgBufferSync
starts over from the current clock-sweep point on each call --- that is,
each bgwriter cycle. So it can't really be made to write very many
buffers without excessive CPU work. Maybe we should redefine it to have
some static state carried across bgwriter cycles, such that it would
write at most N dirty buffers per call, but scan through X percent of
the buffers, possibly across several calls, before returning to the (by
now probably advanced) clock-sweep point. This would allow a larger
value of X to be used than is currently practical. You might wish to
recheck the clock-sweep point on each iteration just to make sure the
scan hasn't fallen behind it, but otherwise I don't see any downside.
The scenario where somebody re-dirties a buffer that was cleaned by the
bgwriter scan isn't a problem, because that buffer will also have had its
usage_count increased and thereby not be a candidate for replacement.
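
A minimal sketch of the resumable scan described above, written as a
Python simulation rather than the real bufmgr.c code. The names
BgScanState, bg_buffer_sync, max_writes (N) and scan_fraction (X) are
hypothetical, and "write" here just clears a dirty flag. It assumes, as
in the quoted text, that only dirty buffers with usage_count == 0 are
candidates.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    dirty: bool = False
    usage_count: int = 0


@dataclass
class BgScanState:
    """Hypothetical state carried across bgwriter cycles."""
    next_buf: int = 0    # buffer index where the scan resumes
    remaining: int = 0   # buffers still to visit in the current pass
    span: int = 0        # total buffers the current pass covers


def bg_buffer_sync(buffers, state, sweep_point, max_writes, scan_fraction):
    """Run one bgwriter cycle; return the indexes of buffers written."""
    n = len(buffers)
    if state.remaining <= 0:
        # Previous pass is done: start a new one at the clock-sweep point.
        state.span = max(1, int(n * scan_fraction))
        state.remaining = state.span
        state.next_buf = sweep_point
    else:
        # Recheck the sweep point. The scan can be at most `scanned`
        # buffers ahead of it; anything else means the sweep passed us.
        scanned = state.span - state.remaining
        ahead = (state.next_buf - sweep_point) % n
        if ahead > scanned:
            state.next_buf = sweep_point

    written = []
    while state.remaining > 0 and len(written) < max_writes:
        buf = buffers[state.next_buf]
        if buf.dirty and buf.usage_count == 0:
            buf.dirty = False            # stands in for the actual write
            written.append(state.next_buf)
        state.next_buf = (state.next_buf + 1) % n
        state.remaining -= 1
    return written
```

Each call stops after max_writes writes or when the pass is exhausted,
so a large scan_fraction costs no more per cycle than a small one; the
pass simply takes more cycles to finish.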

> As a general comment on this subject, a lot of the work in LDC presumes
> you have an accurate notion of how close the next checkpoint is.

Yeah; this is one reason I was interested in carrying some write-speed
state across checkpoints instead of having the calculation start from
scratch each time. That wouldn't help systems that sit idle a long time
and suddenly go nuts, but it seems to me that smoothing the write rate
across more than one checkpoint could help if the fluctuations occur
over a timescale of a few checkpoints.
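
One way to picture carrying write-speed state across checkpoints is an
exponential moving average of the observed write rate. This is only an
illustration: the averaging method, the WriteRateEstimator class and
the alpha parameter are assumptions, not something proposed in this
thread.

```python
class WriteRateEstimator:
    """Hypothetical smoothed write rate, kept across checkpoints."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha   # weight given to the newest checkpoint
        self.rate = None     # buffers per second; None until first sample

    def record_checkpoint(self, buffers_written, elapsed_seconds):
        """Fold one finished checkpoint into the running estimate."""
        observed = buffers_written / elapsed_seconds
        if self.rate is None:
            self.rate = observed
        else:
            self.rate = (self.alpha * observed
                         + (1.0 - self.alpha) * self.rate)
        return self.rate
```

A smaller alpha damps fluctuations over more checkpoints, at the price
of reacting more slowly when an idle system suddenly gets busy.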

regards, tom lane
