Re: Use of O_DIRECT only for open_* sync options

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Use of O_DIRECT only for open_* sync options
Date: 2011-01-20 02:12:29
Message-ID: AANLkTinz7CtONGSoSCh+dPg0i5a6b3juSkOcvdtoAP6F@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 19, 2011 at 1:53 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Is there a reason we only use O_DIRECT with open_* sync options?
> xlogdefs.h says:
>
> /*
>  *  Because O_DIRECT bypasses the kernel buffers, and because we never
>  *  read those buffers except during crash recovery, it is a win to use
>  *  it in all cases where we sync on each write().  We could allow O_DIRECT
>  *  with fsync(), but because skipping the kernel buffer forces writes out
>  *  quickly, it seems best just to use it for O_SYNC.  It is hard to imagine
>  *  how fsync() could be a win for O_DIRECT compared to O_SYNC and O_DIRECT.
>  *  Also, O_DIRECT is never enough to force data to the drives, it merely
>  *  tries to bypass the kernel cache, so we still need O_SYNC or fsync().
>  */
>
> This seems wrong because fsync() can win if there are two writes before
> the sync call.

Well, the comment does say "...in all cases where we sync on each
write()". But that's certainly not true of WAL, so I dunno.
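To make the two-writes case concrete, here is a rough sketch of the two patterns being compared. It is not the xlog.c code: write_batch() is a hypothetical helper, and O_DIRECT is left out on purpose, since it needs aligned buffers and filesystem support that would only clutter the example. The helper counts how many times we wait for data to be forced out.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only (not PostgreSQL code).
 *
 * Writes nwrites blocks of len bytes to path.  With use_osync != 0 the
 * file is opened with O_SYNC, so every write() is its own sync point.
 * Otherwise the writes are buffered and a single fsync() covers the
 * whole batch.
 *
 * Returns the number of sync points incurred, or -1 on error.
 */
static int
write_batch(const char *path, int nwrites, size_t len, int use_osync)
{
    int     flags = O_WRONLY | O_CREAT | O_TRUNC;
    int     syncs = 0;
    int     fd;
    char   *buf;

    if (use_osync)
        flags |= O_SYNC;

    fd = open(path, flags, 0600);
    if (fd < 0)
        return -1;

    buf = malloc(len);
    if (buf == NULL)
    {
        close(fd);
        return -1;
    }
    memset(buf, 'x', len);

    for (int i = 0; i < nwrites; i++)
    {
        if (write(fd, buf, len) != (ssize_t) len)
        {
            free(buf);
            close(fd);
            return -1;
        }
        if (use_osync)
            syncs++;            /* O_SYNC: each write() waits for the flush */
    }

    if (!use_osync)
    {
        if (fsync(fd) != 0)     /* one flush for everything written above */
        {
            free(buf);
            close(fd);
            return -1;
        }
        syncs++;
    }

    free(buf);
    if (close(fd) != 0)
        return -1;
    return syncs;
}
```

With two 8 kB writes (a size picked only for illustration), the O_SYNC path pays for two sync points and the fsync() path pays for one, which is the case Bruce is describing. Whether that turns into a measurable win once O_DIRECT is added would need benchmarking on real hardware.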

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
