Re: Load distributed checkpoint

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Takayuki Tsunakawa <tunakawa(at)soft(dot)fujitsu(dot)com>
Subject: Re: Load distributed checkpoint
Date: 2006-12-07 19:02:33
Message-ID: 45786549.2000602@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Takayuki Tsunakawa wrote:
> Hello, Itagaki-san
>> Checkpoint consists of the following four steps, and the major
>> performance
>> problem is 2nd step. All dirty buffers are written without interval
>> in it.
>> 1. Query information (REDO pointer, next XID etc.)
>> 2. Write dirty pages in buffer pool
>> 3. Flush all modified files
>> 4. Update control file
>
> Hmm. Isn't it possible that step 3 affects the performance greatly?
> I'm sorry if you have already identified step 2 as disturbing
> backends.
>
> As you know, PostgreSQL does not transfer the data to disk when
> write()ing. Actual transfer occurs when fsync()ing at checkpoints,
> unless the filesystem cache runs short. So, disk is overworked at
> fsync()s.

It seems to me that virtual memory settings of the OS will determine
if step 2 or step 3 causes much of the actual disk I/O.

In particular, on Linux, things like /proc/sys/vm/dirty_expire_centisecs
and dirty_writeback_centisecs and possibly dirty_background_ratio
would affect this. If those numbers are high, ISTM most write()s
from step 2 would wait for the flush in step 3. If I understand
correctly, if the dirty_expire_centisecs number is low, most write()s
from step 2 would happen before step 3 because of the pdflush daemons.
I expect other OS's would have different but similar knobs to tune this.

It seems to me that the most portable way postgresql could force
the I/O to be balanced would be to insert otherwise unnecessary
fsync()s into step 2; but that it might (not sure why) be better
to handle this through OS-specific tuning outside of postgres.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2006-12-07 19:32:48 Dead code in _bt_split?
Previous Message Kevin Grittner 2006-12-07 16:25:06 Re: old synchronized scan patch

Browse pgsql-patches by date

  From Date Subject
Next Message Heikki Linnakangas 2006-12-07 19:32:48 Dead code in _bt_split?
Previous Message Heikki Linnakangas 2006-12-07 19:01:26 Re: Index split WAL reduction