Load distributed checkpoint V4.1

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Subject: Load distributed checkpoint V4.1
Date: 2007-04-25 09:16:39
Message-ID: 20070425151044.70B6.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Lists: pgsql-hackers pgsql-patches

Here is an updated version of the LDC patch (V4.1).
In this release, a checkpoint finishes quickly if there are only a few dirty
pages in the buffer pool, following the suggestion from Heikki. Thanks.

If the last write phase finished more quickly than configured, the next
nap phase is also shortened at the same rate. For example, if we set
checkpoint_write_percent = 50% and the write phase actually finished
in 25% of the checkpoint time, the nap time is adjusted to
checkpoint_nap_percent * 25% / 50%.
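
The nap adjustment described above can be sketched as follows. This is only
an illustration in Python, not code from the patch: the function name is
hypothetical, percentages are passed as fractions, the 10% nap setting in the
usage note is a made-up value, and capping the ratio at 1.0 (so the nap is
never lengthened) is an assumption.

```python
def adjusted_nap_fraction(write_percent, nap_percent, actual_write_fraction):
    """Scale the nap phase by the share of the write budget actually used.

    All arguments are fractions of the checkpoint time (0.5 means 50%).
    """
    if write_percent <= 0:
        # No write budget configured: leave the nap setting untouched.
        return nap_percent
    # Assumption: the nap is only ever shortened, never lengthened.
    ratio = min(actual_write_fraction / write_percent, 1.0)
    return nap_percent * ratio
```

With checkpoint_write_percent = 50%, a write phase that took 25%, and a
hypothetical checkpoint_nap_percent of 10%, adjusted_nap_fraction(0.5, 0.1, 0.25)
gives half of the configured nap, that is, 5% of the checkpoint time.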

In the sync phase, we cut down the duration if there are only a few files
to fsync. We assume that the storage has a throughput of at least
10 * bgwriter_all_maxpages pages per bgwriter_delay (this is arguable).
For example, when bgwriter_delay=200ms and bgwriter_all_maxpages=5, we
assume that we can use 2MB/s of flush throughput
(10 * 5 pages * 8kB / 200ms). If there are 200MB of files to fsync, the
duration of the sync phase is cut down to 100sec whenever that is shorter
than checkpoint_sync_percent * checkpoint_timeout.
I use bgwriter_all_maxpages here as something like a 'reserved band of
storage for bgwriter'. If there is a better name for it, please rename it.
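
The sync-phase arithmetic above can be checked with a small sketch. Again
this is Python for illustration only, not the patch itself: the function
names are hypothetical, sizes are in kB with 1MB taken as 1000kB to match the
round numbers in the example, and the 20% / 900sec settings in the usage note
are made-up values.

```python
PAGE_SIZE_KB = 8  # page size used in the example above


def assumed_flush_throughput_kb_per_s(bgwriter_all_maxpages, bgwriter_delay_ms):
    """Assumed storage throughput: 10 * maxpages pages per bgwriter_delay."""
    pages_per_round = 10 * bgwriter_all_maxpages
    return pages_per_round * PAGE_SIZE_KB * 1000 / bgwriter_delay_ms


def sync_phase_duration_s(kb_to_fsync, throughput_kb_per_s,
                          checkpoint_sync_percent, checkpoint_timeout_s):
    """Use the configured sync budget unless the files can be flushed sooner."""
    configured = checkpoint_sync_percent * checkpoint_timeout_s
    needed = kb_to_fsync / throughput_kb_per_s
    return min(configured, needed)
```

For bgwriter_delay=200ms and bgwriter_all_maxpages=5 the first function gives
2000 kB/s, so 200MB of files need 100sec. With a hypothetical
checkpoint_sync_percent of 20% and checkpoint_timeout of 900sec, the
configured budget would be 180sec and the sync phase is cut down to 100sec.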

Heikki Linnakangas <heikki(at)enterprisedb(dot)com> wrote:

> I guess we're fine if we do just avoid excessive waiting per the
> discussion in the next paragraph, and use a reasonable safety margin in
> the default values.
>
> >> Should we try doing something similar for the sync phase? If there's
> >> only 2 small files to fsync, there's no point sleeping for 5 minutes
> >> between them just to use up the checkpoint_sync_percent budget.
> >
> > Hmmm... if we add a new parameter like kernel_write_throughput [kB/s] and
> > clamp the maximum sleeping to size-of-segment / kernel_write_throughput (*1),
> > we can avoid unnecessary sleeping in fsync phase. Do we want to have such
> > a new parameter? I think we have a great many GUC variables even now.
>
> How about using the same parameter that controls the minimum write speed
> of the write-phase (the patch used bgwriter_all_maxpages, but I
> suggested renaming it)?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment: LDC_V41.patch (application/octet-stream, 32.7 KB)
