Re: ext4 finally doing the right thing

From: Greg Stark <stark(at)mit(dot)edu>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: ext4 finally doing the right thing
Date: 2010-01-21 05:15:40
Message-ID: 407d949e1001202115k72e98b8eg9b6aebc127319328@mail.gmail.com
Lists: pgsql-performance

That doesn't sound right. The kernel having 10% of memory dirty doesn't mean
there's a queue you have to jump at all. You don't get into any queue until
the kernel initiates write-out, which will be based on the usage counters,
basically an LRU. fsync and cousins like sync_file_range and
posix_fadvise(POSIX_FADV_DONTNEED) initiate write-out right away.
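
As a rough illustration of the calls mentioned above, here is a minimal Python sketch that writes a buffer and then asks the kernel to start write-out instead of waiting for background flushing. The helper name write_and_flush and the file name in the usage note are hypothetical; sync_file_range is left out because Python's os module does not expose it, and posix_fadvise is guarded because it is only available on some Unix platforms.

```python
import os
import tempfile


def write_and_flush(path: str, payload: bytes) -> int:
    """Write payload to path, then request write-out right away.

    Hypothetical helper for illustration only; returns the bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(payload):
            written += os.write(fd, payload[written:])

        # fsync blocks until the kernel reports this file's dirty pages
        # have been transferred to the storage device.
        os.fsync(fd)

        # Advise the kernel that the cached pages are no longer needed.
        # Guarded: posix_fadvise is not present on every platform.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return written
    finally:
        os.close(fd)
```

Calling write_and_flush(os.path.join(tempfile.mkdtemp(), "example.bin"), b"x" * 8192) writes 8192 bytes and returns only after fsync has completed.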

How many pending write-out requests, and for how much data, the kernel should
keep active is another question, but I imagine it has more to do with storage
hardware than how much memory your system has. And for most hardware it's
probably on the order of megabytes or less.

greg

On 20 Jan 2010 21:19, "Greg Smith" <greg(at)2ndquadrant(dot)com> wrote:

Jeff Davis wrote:
>> On one side, we might finally be able to use regular drives with their ...
I know they just tweaked this area recently so this may be a bit out of
date, but kernels starting with 2.6.22 allow you to get up to 10% of memory
dirty before getting really aggressive about writing things out, with writes
starting to go heavily at 5%. So even with a 1GB server, you could easily
find 100MB of data sitting in the kernel buffer cache ahead of a database
write that needs to hit disc. Once you start considering the case with
modern hardware, where even my desktop has 8GB of RAM and most serious
servers I see have 32GB, you can easily have gigabytes of such data queued
in front of the write that now needs to hit the platter.
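
To make the arithmetic above concrete, here is a small Python sketch that computes the 5% and 10% figures for the memory sizes mentioned. The function name dirty_thresholds is hypothetical, and applying the percentages to total RAM is a simplifying assumption; the kernel's own accounting may use a different base.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def dirty_thresholds(ram_bytes: int, background_pct: int = 5, hard_pct: int = 10):
    """Return (background, hard) dirty-data thresholds in bytes.

    Assumption: percentages are applied to total RAM, which is only an
    approximation of what the kernel actually measures against.
    """
    background = ram_bytes * background_pct // 100
    hard = ram_bytes * hard_pct // 100
    return background, hard


# The three memory sizes mentioned in the message: 1GB, 8GB and 32GB.
for ram_gib in (1, 8, 32):
    bg, hard = dirty_thresholds(ram_gib * GIB)
    print(f"{ram_gib:>2} GiB RAM: heavy writes from ~{bg / MIB:.0f} MiB, "
          f"aggressive write-out at ~{hard / MIB:.0f} MiB")
```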

The dream is that a proper barrier implementation will then shuffle your
important write to the front of that queue, without waiting for everything
else to clear first. The exact performance impact depends on how many
non-database writes happen. But even on a dedicated database disk, it
should still help because there are plenty of non-sync'd writes coming out
of the background writer via its routine work and the checkpoint writes. And
the ability to fully utilize the write cache on the individual drives, on
commodity hardware, without risking database corruption would make life a
lot easier.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com
