Re: Load Distributed Checkpoints, take 3

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Load Distributed Checkpoints, take 3
Date: 2007-06-22 18:36:46
Message-ID: 467C16BE.3080809@enterprisedb.com
Lists: pgsql-patches

Tom Lane wrote:
> Maybe I misread the patch, but I thought that if someone requested an
> immediate checkpoint, the checkpoint-in-progress would effectively flip
> to immediate mode. So that could be handled by offering an immediate vs
> extended checkpoint option in pg_start_backup. I'm not sure it's a
> problem though, since as previously noted you probably want
> pg_start_backup to be noninvasive. Also, one could do a manual
> CHECKPOINT command then immediately pg_start_backup if one wanted
> as-fast-as-possible (CHECKPOINT requests immediate checkpoint, right?)

Yeah, that's possible.

>> and recovery would need to process on average 1.5 as much WAL as before.
>> Though with LDC, you should get away with shorter checkpoint intervals
>> than before, because the checkpoints aren't as invasive.
>
> No, you still want a pretty long checkpoint interval, because of the
> increase in WAL traffic due to more page images being dumped when the
> interval is short.
>
>> If we do that, we should remove bgwriter_all_* settings. They wouldn't
>> do much because we would have checkpoint running all the time, writing
>> out dirty pages.
>
> Yeah, I'm not sure that we've thought through the interactions with the
> existing bgwriter behavior.

I searched the archives a bit for the discussions when the current
bgwriter settings were born, and found this thread:

http://archives.postgresql.org/pgsql-hackers/2004-12/msg00784.php

The idea of Load Distributed Checkpoints certainly isn't new :).

Ok, if we approach this from the idea that there will be *no* GUC
variables at all to control this, and we remove the bgwriter_all_*
settings as well, does anyone see a reason why that would be bad? Here
are the ones mentioned so far:

1. We need to keep twice as many WAL segments around as before.

2. pg_start_backup will need to wait for a long time.

3. Recovery will take longer, because the redo pointer of the last
completed checkpoint will lag further behind.

1. and 3. can be alleviated by using a smaller checkpoint_timeout or
checkpoint_segments, though as you pointed out, that leads to higher
WAL traffic. 2. is not a big deal, and we can add an 'immediate'
parameter to pg_start_backup if necessary.
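
To put rough numbers on 1. and 3., here is a back-of-the-envelope
sketch in Python. It is a toy model, not anything taken from the patch:
it assumes WAL is generated at a constant rate, that a checkpoint's redo
pointer is taken when the checkpoint starts, and that the checkpoint
takes a fixed fraction of the checkpoint interval to complete. The
function name and the "spread" parameter are made up for illustration.
Distances are in units of one checkpoint interval's worth of WAL.

```python
def redo_distance(spread, interval_wal=1.0):
    """Return (average, peak) WAL distance back to the redo pointer of
    the last *completed* checkpoint.

    spread       -- hypothetical parameter: fraction of the checkpoint
                    interval a checkpoint takes to finish
                    (0.0 = instantaneous, 1.0 = the whole interval)
    interval_wal -- WAL generated during one checkpoint interval
                    (assumed constant rate)
    """
    if not 0.0 <= spread <= 1.0:
        raise ValueError("spread must be between 0.0 and 1.0")

    # At the moment a checkpoint completes, its redo pointer is already
    # `spread` intervals behind the current WAL position.
    youngest = spread * interval_wal

    # That redo pointer stays in use until the next checkpoint completes,
    # one full interval later.
    oldest = (1.0 + spread) * interval_wal

    # With a constant WAL rate, a crash is equally likely anywhere in
    # that window, so the average is the midpoint.
    average = (youngest + oldest) / 2.0
    return average, oldest


if __name__ == "__main__":
    for spread in (0.0, 0.5, 1.0):
        average, peak = redo_distance(spread)
        print(f"spread={spread:.1f}  average={average:.2f}  peak={peak:.2f}")
```

Under these assumptions, a checkpoint that is spread over the whole
interval doubles the peak distance (which is what drives the number of
WAL segments we need to keep) compared to an instantaneous one, and the
average distance to replay in recovery becomes 1.5 intervals. Real
pre-LDC checkpoints are not instantaneous, so the actual baseline sits
somewhere in between.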

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
