Re: a question about Direct I/O and double buffering

From: Erik Jones <erik(at)myemma(dot)com>
To: Mark Lewis <mark(dot)lewis(at)mir3(dot)com>
Cc: Xiaoning Ding <dingxn(at)cse(dot)ohio-state(dot)edu>, pgsql-performance(at)postgresql(dot)org
Subject: Re: a question about Direct I/O and double buffering
Date: 2007-04-05 18:58:31
Message-ID: 7B476592-606F-46F0-A643-9C4E5D85CE6E@myemma.com
Lists: pgsql-performance


On Apr 5, 2007, at 1:27 PM, Mark Lewis wrote:
> On Thu, 2007-04-05 at 13:09 -0500, Erik Jones wrote:
>> On Apr 5, 2007, at 12:09 PM, Xiaoning Ding wrote:
>>
>>> Hi,
>>>
>>> A page may be double buffered: once in PG's buffer pool and once in
>>> the OS's buffer cache. Other DBMSs such as DB2 and Oracle provide a
>>> Direct I/O option to eliminate double buffering. I noticed there were
>>> discussions about this on the list, but I cannot find a similar
>>> option in PG. Does PG support direct I/O now?
>>>
>>> The PG tuning guides usually recommend a small shared buffer pool
>>> (compared to the size of physical memory). I think this is to avoid
>>> swapping. If there were swapping, the OS kernel might swap out some
>>> pages in PG's buffer pool even when PG wants to keep them in memory,
>>> i.e. PG would lose full control over its buffer pool.
>>> A large buffer pool is not good because it may
>>> 1. cause more pages to be double buffered, and thus decrease the
>>>    efficiency of the buffer cache and the buffer pool;
>>> 2. cause swapping.
>>> Am I right?
>>>
>>> If PG's buffer pool is small compared with physical memory, can I say
>>> that the hit ratio of PG's buffer pool is not very meaningful, because
>>> most misses can be satisfied from the OS kernel's buffer cache?
>>>
>>> Thanks!
>>
>>
>> To the best of my knowledge, Postgres itself does not have a direct IO
>> option (although it would be a good addition). So, in order to use
>> direct IO with postgres you'll need to consult your filesystem docs
>> for how to set the forcedirectio mount option. I believe it can be
>> set dynamically, but if you want it to be permanent you'll need to add
>> it to your fstab/vfstab file.

> Not to hijack this thread, but has anybody here tested the behavior of
> PG on a file system with OS-level caching disabled via forcedirectio or
> by using an inherently non-caching file system such as ocfs2?
>
> I've been thinking about trying this setup to avoid double-caching now
> that the 8.x series scales shared buffers better, but I figured I'd ask
> first if anybody here had experience with similar configurations.
>
> -- Mark

Rather than repeat everything, I'll point out that we had a pretty
decent discussion on this just last week (in a thread that I started),
so check the archives. In summary though, if you have a high io
transaction load on a db whose "working set" of data doesn't, on
average, fit in memory with room to spare, then direct io can be a huge
plus; otherwise you probably won't see much of a difference. I have yet
to hear of anybody actually seeing any degradation in db performance
from it. In addition, while it doesn't bother me, I'd watch the top
posting, as some people get pretty religious about it (I moved your
comments down).
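
For anyone who wants to play with the numbers, here is a rough
back-of-the-envelope sketch of the double-buffering arithmetic raised in
this thread. It is a toy model, not a measurement: the function names,
the "overlap" parameter and the 0.8 headroom factor are all assumptions
made up for illustration, and nothing here comes from Postgres itself.

```python
def distinct_cached_pages(pool_pages, os_cache_pages, overlap):
    """Distinct pages held across PG's buffer pool and the OS cache.

    overlap is an assumed fraction (0..1) of the smaller cache whose
    pages are also present in the other one (i.e. double buffered).
    """
    duplicated = min(pool_pages, os_cache_pages) * overlap
    return pool_pages + os_cache_pages - duplicated


def disk_read_fraction(pool_hit_ratio, os_hit_ratio_on_miss):
    """Fraction of page requests that actually reach the disk.

    A request goes to disk only if it misses the buffer pool AND the
    OS cache fails to satisfy that miss.
    """
    return (1.0 - pool_hit_ratio) * (1.0 - os_hit_ratio_on_miss)


def working_set_fits(working_set_pages, cached_pages, headroom=0.8):
    """True if the working set fits in cache 'with room to spare'.

    headroom is an arbitrary illustrative safety factor.
    """
    return working_set_pages <= headroom * cached_pages


if __name__ == "__main__":
    # Hypothetical page counts, chosen only to show the effect.
    buffered = distinct_cached_pages(1000, 3000, overlap=0.5)
    direct = distinct_cached_pages(4000, 0, overlap=0.0)
    print("distinct pages, double buffered:", buffered)
    print("distinct pages, all memory in pool:", direct)
    print("working set of 3000 fits (buffered):",
          working_set_fits(3000, buffered))
    print("working set of 3000 fits (direct):",
          working_set_fits(3000, direct))
    print("disk read fraction:", disk_read_fraction(0.6, 0.9))
```

The second function is the point behind the hit-ratio question: under
these assumptions a modest buffer pool hit ratio can still mean few
physical reads when the OS cache absorbs most of the misses.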

erik jones <erik(at)myemma(dot)com>
software developer
615-296-0838
emma(r)
