Re: checkpoint writeback via sync_file_range

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: checkpoint writeback via sync_file_range
Date: 2012-01-13 17:08:51
Message-ID: CAMkU=1yRO-a9i-OoVYe8xj00J_LgO34DXPgnysqT3U3xGbnb_A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 12, 2012 at 7:26 PM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
> On 1/11/12 9:25 AM, Andres Freund wrote:
>>
>> The heavy pressure putting it directly in the writeback queue
>> leads to less efficient io because quite often it won't reorder sensibly
>> with other io anymore and the like. At least that was my experience in
>> using it in another application.
>
>
> Sure, this is one of the things I was cautioning about in the Double Writes
> thread, with VACUUM being the worst such case I've measured.
>
> The thing to realize here is that the data we're talking about must be
> flushed to disk in the near future.  And Linux will happily cache gigabytes
> of it.  Right now, the database asks for that to be forced to disk via
> fsync, which means in chunks that can be as large as a gigabyte.
>
> Let's say we have a traditional storage array and there's competing
> activity.  10MB/s would be a good random I/O write rate in that situation.
>  A single fsync that forces 1GB out at that rate will take *100 seconds*.
>  And I've seen exactly that when trying to--about 80 seconds is my current
> worst checkpoint stall ever.
>
> And we don't have a latency vs. throughput knob any finer than that.  If one
> is added, and you turn it too far toward latency, throughput is going to
> tank for the reasons you've also seen.  Less reordering, elevator sorting,
> and write combining.  If the database isn't going to micro-manage the
> writes, it needs to give the OS room to do that work for it.
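
To put numbers on the stall estimate quoted above, here is a rough
back-of-the-envelope sketch. It assumes decimal units (1GB = 10^9 bytes,
10MB/s = 10^7 bytes/s) and a constant write rate; the function name is
just for illustration.

```python
MB = 10**6  # decimal megabyte, an assumption made for round numbers
GB = 10**9  # decimal gigabyte


def fsync_stall_seconds(dirty_bytes, write_rate_bytes_per_s):
    """Time to flush dirty_bytes at a constant write rate, in seconds."""
    return dirty_bytes / write_rate_bytes_per_s


if __name__ == "__main__":
    # One fsync forcing out a full 1GB at 10MB/s of random-write throughput.
    print(fsync_stall_seconds(1 * GB, 10 * MB))  # prints 100.0
```

With a binary gigabyte (2^30 bytes) the same calculation comes out a
little over 107 seconds, so the quoted figure is a round-number estimate.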

Are there any IO benchmarking tools out there that benchmark the
effects of reordering, elevator sorting, write combining, etc.?

What I've seen is basically either "completely sequential" or
"completely random" with not much in between.
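
To make concrete what I mean by "in between", here is a minimal sketch of
a write pattern with a tunable amount of disorder. This is not an existing
tool; the `locality` knob, the function names, and the 8kB block size are
all assumptions for illustration, and it only times buffered writes
followed by one fsync.

```python
import os
import random
import tempfile
import time

BLOCK = 8192  # assumed block size (PostgreSQL's default page size)


def make_offsets(n_blocks, locality, seed=0):
    """Return a permutation of block numbers 0..n_blocks-1.

    locality=1.0 yields sequential order; lower values displace each
    block by a random amount, up to heavy shuffling at locality=0.0.
    """
    rng = random.Random(seed)
    spread = (1.0 - locality) * n_blocks
    keyed = [(i + rng.uniform(-spread, spread), i) for i in range(n_blocks)]
    keyed.sort()
    return [blk for _, blk in keyed]


def sequential_fraction(order):
    """Fraction of writes that land directly after the previous block."""
    hits = sum(1 for a, b in zip(order, order[1:]) if b == a + 1)
    return hits / max(len(order) - 1, 1)


def timed_write(path, order):
    """Write one block per entry in `order`, fsync once, return seconds."""
    buf = b"\0" * BLOCK
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for blk in order:
            os.pwrite(fd, buf, blk * BLOCK)
        os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for locality in (1.0, 0.9, 0.5, 0.0):
            order = make_offsets(256, locality)
            secs = timed_write(os.path.join(tmp, "bench.dat"), order)
            print(locality, round(sequential_fraction(order), 2), secs)
```

A file this small fits entirely in cache, so a real test would need a
data set much larger than RAM and competing I/O before the elevator and
write-combining effects show up in the timings.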

Cheers,

Jeff
