Use of O_DIRECT only for open_* sync options

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Use of O_DIRECT only for open_* sync options
Date: 2011-01-19 18:53:14
Message-ID: 201101191853.p0JIrEn15002@momjian.us
Lists: pgsql-hackers

Is there a reason we only use O_DIRECT with open_* sync options?
xlogdefs.h says:

/*
* Because O_DIRECT bypasses the kernel buffers, and because we never
* read those buffers except during crash recovery, it is a win to use
* it in all cases where we sync on each write(). We could allow O_DIRECT
* with fsync(), but because skipping the kernel buffer forces writes out
* quickly, it seems best just to use it for O_SYNC. It is hard to imagine
* how fsync() could be a win for O_DIRECT compared to O_SYNC and O_DIRECT.
* Also, O_DIRECT is never enough to force data to the drives, it merely
* tries to bypass the kernel cache, so we still need O_SYNC or fsync().
*/

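This seems wrong because fsync() can win when there are two or more
writes before the sync call: a single fsync() then covers all of them,
whereas O_SYNC forces a flush on every write(). Is it that kernels cannot
honor fsync() when the writes were done with O_DIRECT? If that is the
reason, we should document it.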
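
To make the two patterns concrete, here is a rough sketch, not PostgreSQL
code. The helper name write_wal_blocks(), the 8192-byte block size, and the
4096-byte buffer alignment are all assumptions made for illustration. It
opens a file with O_DIRECT and either adds O_SYNC (one flush per write())
or issues one fsync() after the last write(). It returns how many sync
points were needed. If the filesystem rejects O_DIRECT with EINVAL, it falls
back to buffered I/O so the comparison still runs.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* no O_DIRECT on this platform: buffered I/O */
#endif

#define BLOCK_SIZE 8192         /* assumed block size, illustration only */
#define BUF_ALIGN  4096         /* assumed alignment acceptable to O_DIRECT */

/*
 * Hypothetical helper: write nblocks blocks to path.
 *   sync_on_write != 0: open with O_SYNC, so each write() is a sync point.
 *   sync_on_write == 0: one fsync() after the last write().
 * Returns the number of sync points, or -1 on error.
 */
int
write_wal_blocks(const char *path, int nblocks, int sync_on_write)
{
    int     flags = O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT;
    int     fd;
    int     syncs = 0;
    int     ok = 1;
    void   *buf = NULL;

    if (sync_on_write)
        flags |= O_SYNC;

    fd = open(path, flags, 0600);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, flags & ~O_DIRECT, 0600);   /* O_DIRECT rejected */
    if (fd < 0)
        return -1;

    /* O_DIRECT generally requires an aligned buffer and aligned lengths */
    if (posix_memalign(&buf, BUF_ALIGN, BLOCK_SIZE) != 0)
    {
        close(fd);
        return -1;
    }
    memset(buf, 'x', BLOCK_SIZE);

    for (int i = 0; i < nblocks && ok; i++)
    {
        if (write(fd, buf, BLOCK_SIZE) != (ssize_t) BLOCK_SIZE)
            ok = 0;
        else if (sync_on_write)
            syncs++;            /* O_SYNC: this write() waited for the flush */
    }

    if (ok && !sync_on_write)
    {
        if (fsync(fd) != 0)
            ok = 0;
        else
            syncs++;            /* one fsync() covers all earlier writes */
    }

    free(buf);
    if (close(fd) != 0)
        ok = 0;
    return ok ? syncs : -1;
}
```

With two blocks, write_wal_blocks(path, 2, 1) reports two sync points and
write_wal_blocks(path, 2, 0) reports one. Whether the single fsync() is
actually faster depends on the kernel and the storage, which this sketch
does not measure.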

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
