Re: ext4 finally doing the right thing

From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, pgsql-performance(at)postgresql(dot)org,Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: ext4 finally doing the right thing
Date: 2010-01-21 13:51:29
Lists: pgsql-performance

* Greg Smith <greg(at)2ndquadrant(dot)com> [100121 00:58]:
> Greg Stark wrote:
>> That doesn't sound right. The kernel having 10% of memory dirty  
>> doesn't mean there's a queue you have to jump at all. You don't get  
>> into any queue until the kernel initiates write-out which will be  
>> based on the usage counters -- basically a lru. fsync and cousins like  
>> sync_file_range and posix_fadvise(DONT_NEED) initiate write-out  
>> right away.
> Most safe ways ext3 knows how to initiate a write-out on something that  
> must go (because it's gotten an fsync on data there) require flushing  
> every outstanding write to that filesystem along with it.  So as soon as  
> a single WAL write shows up, bam!  The whole cache is emptied (or at  
> least everything associated with that filesystem), and the caller who  
> asked for that little write is stuck waiting for everything to clear  
> before their fsync returns success.

Sure, if your WAL is on the same FS as your data, you're going to get
hit, and *especially* on ext3...

But, I think that's one of the reasons people usually recommend putting
WAL separate.  Even if it's just another partition on the same (set of)
disk(s), you get the benefit of not having to wait for all the dirty
ext3 pages from your whole database FS to be flushed before the WAL write
can complete on its own FS.
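
If you want to see the effect on your own box, something like the rough
sketch below should do. It leaves a pile of unsynced data in one directory,
then times the fsync of a single small write in another. The function names
(dirty_pages, timed_fsync) and the 8kB / 64MB sizes are made up for
illustration, and this is plain Python, not anything PostgreSQL does itself.

```python
import os
import tempfile
import time


def dirty_pages(directory, megabytes=64):
    """Write `megabytes` of data WITHOUT fsync, leaving it dirty in the page cache.

    Returns (fd, path) so the caller can close and remove the file afterwards.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    chunk = b"\0" * (1024 * 1024)
    for _ in range(megabytes):
        os.write(fd, chunk)
    return fd, path


def timed_fsync(directory, payload=b"x" * 8192):
    """Write one small payload into `directory`; return seconds spent in fsync."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, payload)
        start = time.perf_counter()
        os.fsync(fd)  # stand-in for the small WAL write that must hit disk
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
```

Call dirty_pages() on a directory in the data filesystem, then timed_fsync()
once on a directory in that same filesystem and once on a directory in a
separate partition, and compare the two numbers. How big the gap is will
depend on the filesystem, mount options and kernel, so treat the result as an
indication and not a benchmark.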


Aidan Van Dyk                                             Create like a god,
aidan(at)highrise(dot)ca                                       command like a king,
                                                          work like a slave.
