Re: Improvement of checkpoint IO scheduler for stable transaction responses

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improvement of checkpoint IO scheduler for stable transaction responses
Date: 2013-07-03 13:39:43
Message-ID: 20130703133943.GA5667@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-07-03 17:18:29 +0900, KONDO Mitsumasa wrote:
> Hi,
>
> I tested with segsize changed to 0.25GB (./configure --with-segsize=0.25).
> segsize is the maximum size of each file that a table is split into, and its
> default is 1GB. I thought that a small segsize would be good for the fsync phase
> and for background disk writes by the OS during checkpoints. I got significant
> improvements in the DBT-2 result!
>
> * Performance result in DBT-2 (WH340)
>                              | NOTPM   | 90%tile   | Average | Maximum
> -----------------------------+---------+-----------+---------+-----------
> original_0.7 (baseline)      | 3474.62 | 18.348328 | 5.739   | 36.977713
> fsync + write                | 3586.85 | 14.459486 | 4.960   | 27.266958
> fsync + write + segsize=0.25 | 3661.17 |  8.28816  | 4.117   | 17.23191
>
> Changing segsize together with my checkpoint patches improved the 90%tile and
> maximum response times by over 50% compared to the original.

Hm. I wonder how much of this could be gained by doing a
sync_file_range(SYNC_FILE_RANGE_WRITE) (or similar) either while doing
the original checkpoint-pass through the buffers or when fsyncing the
files. Presumably the smaller segsize is better because we don't
completely stall the system by submitting up to 1GB of io at once. So,
if we were to do it in 32MB chunks and then do a final fsync()
afterwards we might get most of the benefits.
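
The call pattern described above could be sketched roughly as follows. This
is only an illustration, not code from any patch in this thread: the names
flush_in_chunks(), chunk_nbytes() and FLUSH_CHUNK_BYTES are hypothetical, and
the 32MB chunk size is simply the figure mentioned above. sync_file_range()
is Linux-specific. In a real checkpointer the calls would presumably be
interleaved with the buffer writes rather than issued back to back.

```c
#define _GNU_SOURCE             /* glibc needs this to declare sync_file_range() */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical chunk size: the 32MB figure from the text above. */
#define FLUSH_CHUNK_BYTES ((off_t) 32 * 1024 * 1024)

/*
 * Hypothetical helper: number of bytes in the chunk starting at "offset"
 * for a file range of total length "len" (at most FLUSH_CHUNK_BYTES).
 */
off_t
chunk_nbytes(off_t offset, off_t len)
{
    off_t remaining = len - offset;

    return remaining > FLUSH_CHUNK_BYTES ? FLUSH_CHUNK_BYTES : remaining;
}

/*
 * Hypothetical helper: ask the kernel to start writeback of [0, len) in
 * chunks, then make everything durable with one final fsync().
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int
flush_in_chunks(int fd, off_t len)
{
    off_t offset;

    for (offset = 0; offset < len; offset += FLUSH_CHUNK_BYTES)
    {
        /*
         * SYNC_FILE_RANGE_WRITE initiates write-out of the dirty pages in
         * the range; it does not wait for the writes to complete.
         */
        if (sync_file_range(fd, offset, chunk_nbytes(offset, len),
                            SYNC_FILE_RANGE_WRITE) != 0)
            return -1;
    }

    /* sync_file_range() gives no durability guarantee; fsync() does. */
    return fsync(fd);
}
```

With a 1GB segment this would issue 32 writeback requests before the final
fsync(), instead of leaving the whole segment to that single call.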

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
