Re: Load distributed checkpoint

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Load distributed checkpoint
Date: 2006-12-12 17:38:18
Message-ID: 878xhdj9px.fsf@enterprisedb.com
Lists: pgsql-hackers pgsql-patches


> Tom Lane wrote:
>>
>> I like Kevin's settings better than what Jim suggests. If the bgwriter
>> only makes one sweep between checkpoints then it's hardly going to make
>> any impact at all on the number of dirty buffers the checkpoint will
>> have to write. The point of the bgwriter is to reduce the checkpoint
>> I/O spike by doing writes between checkpoints, and to have any
>> meaningful impact on that, you'll need it to make the cycle several times.
>>
>> Another point here is that you want checkpoints to be pretty far apart
>> to minimize the WAL load from full-page images. So again, a bgwriter
>> that's only making one loop per checkpoint is not gonna be doing much.

I missed the previous message but it sounds like you're operating under a
different set of assumptions than the original poster. If you do a single
sweep through all of the buffers *and sync them* then you've just finished a
checkpoint -- the *previous* checkpoint. Not the subsequent one.

That is, rather than trying to spread the load of the checkpoint out by
getting the writes into the kernel sooner while making no attempt to sync them
until checkpoint time, start the checkpoint as soon as the previous checkpoint
finishes, and dribble the blocks of the checkpoint out slowly throughout an
entire checkpoint cycle, syncing them immediately using O_SYNC/O_DIRECT.
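
To illustrate what "syncing them immediately" means at the system-call level,
here is a minimal sketch in Python rather than PostgreSQL's C. The function
name, the 8192-byte block size, and the file layout are assumptions made for
the example, not anything taken from the backend. It relies on os.pwrite and
os.O_SYNC, so it is Unix-only.

```python
import os

BLOCK_SIZE = 8192  # assumed block size, for illustration only


def write_block_synced(path, block_no, data):
    """Write one block at its offset in the file at `path`.

    The file is opened with O_SYNC, which asks the kernel not to return
    from the write until the data has been handed to stable storage, so
    no separate fsync pass is needed afterwards.
    """
    # getattr keeps the sketch importable where O_SYNC is not defined;
    # in that case the write is simply not synchronous.
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        # pwrite writes at an explicit offset without moving a file pointer.
        return os.pwrite(fd, data, block_no * BLOCK_SIZE)
    finally:
        os.close(fd)
```

The point of the sketch is only that each block is durable when its write
returns, so nothing is left over for a sync at checkpoint time.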

It's a fundamental shift in the idea of the purpose of bgwriter. Instead of
trying to suck i/o away from the subsequent checkpoint, it would be responsible
for all the i/o of the previous checkpoint, which would still be in progress
for the entire time of checkpoint_timeout. The checkpoint would only complete
when bgwriter had finished doing its one full sweep.
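
The pacing this implies can be sketched as follows. This is a toy model in
Python, not bgwriter code: blocks_due, sweep_schedule, and the tick interval
are hypothetical names and parameters, and the only assumption is that the
writer spreads the blocks evenly so that its single sweep finishes exactly
when checkpoint_timeout expires.

```python
def blocks_due(total_dirty, elapsed, checkpoint_timeout):
    """Blocks that should already be written `elapsed` seconds into the cycle.

    Linear pacing: the fraction of blocks written tracks the fraction of
    checkpoint_timeout that has passed, clamped to the range [0, total_dirty].
    """
    if checkpoint_timeout <= 0:
        raise ValueError("checkpoint_timeout must be positive")
    clamped = min(max(elapsed, 0), checkpoint_timeout)
    return total_dirty * clamped // checkpoint_timeout


def sweep_schedule(total_dirty, checkpoint_timeout, tick):
    """Yield (time, blocks_to_write_now) once per tick until the sweep is done."""
    if tick <= 0:
        raise ValueError("tick must be positive")
    written = 0
    now = 0
    while written < total_dirty:
        now += tick
        due = blocks_due(total_dirty, now, checkpoint_timeout)
        yield now, due - written
        written = due


if __name__ == "__main__":
    # 1000 dirty blocks, a 300 second cycle, the writer waking every 60 seconds.
    for now, count in sweep_schedule(1000, 300, 60):
        print(now, count)
```

With those numbers each wakeup writes 200 blocks, and the last batch goes out
at the 300 second mark, which is the moment the checkpoint would be declared
complete.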

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
