Re: O_DIRECT for WAL writes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: O_DIRECT for WAL writes
Date: 2005-05-30 15:24:54
Message-ID: 13664.1117466694@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Neil Conway <neilc(at)samurai(dot)com> writes:
> On Mon, 2005-05-30 at 02:52 -0400, Tom Lane wrote:
> Well, that claims that "data is guaranteed to have been transferred",
> but transferred to *where* is the question :)

Oh, I see what you are worried about. I think you are right: what the
doc promises is only that the DMA transfer has finished (ie, it's safe
to scribble on your buffer again). So you'd still need an fsync;
which makes O_DIRECT orthogonal to wal_sync_method rather than a
valid choice for it. (Hm, I wonder if specifying both O_DIRECT and
O_SYNC works ...)

> The other question is whether these semantics are identical among the
> various O_DIRECT implementations (e.g. Linux, FreeBSD, AIX, IRIX, and
> others).

Wouldn't count on it :-(. One thing I'm particularly worried about is
buffer cache consistency: does the kernel guarantee to flush any buffers
it has that overlap the O_DIRECT write operation? Without this, an
application reading the WAL using normal non-O_DIRECT I/O might see the
wrong data; which is bad news for PITR.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-05-30 15:48:09 Re: Interval->day proposal
Previous Message Greg Stark 2005-05-30 15:16:48 Re: Escape handling in COPY, strings, psql

Browse pgsql-patches by date

  From Date Subject
Next Message CD DO DIREITO 2005-05-30 17:18:22 ARQUIVOS JURIDICOS
Previous Message Greg Stark 2005-05-30 15:16:48 Re: Escape handling in COPY, strings, psql