Re: O_DIRECT in freebsd

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Sean Chittenden <sean(at)chittenden(dot)org>, "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: O_DIRECT in freebsd
Date: 2003-06-23 02:43:36
Message-ID: 200306230243.h5N2haa15347@candle.pha.pa.us
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > Basically, I think we need free-behind rather than O_DIRECT.
>
> There are two separate issues here --- one is what's happening in our
> own cache, and one is what's happening in the kernel disk cache.
> Implementing our own free-behind code would help in our own cache but
> does nothing for the kernel cache.

Right.
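
To make the distinction concrete, here is a minimal sketch of free-behind in a private buffer pool. It assumes a toy tick-based LRU, not PostgreSQL's actual buffer manager, and the names Pool, pool_access and the free_behind flag are hypothetical. A page brought in by a sequential scan is stamped as the next eviction victim, so the scan keeps recycling one slot instead of pushing out hot pages. Nothing here affects the kernel's disk cache, which is the second issue raised above.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4        /* toy pool size, chosen only for illustration */
#define NO_PAGE   (-1)

typedef struct {
    int           page;    /* page held in this slot, or NO_PAGE */
    unsigned long tick;    /* last-use stamp; 0 means "reuse me first" */
} Slot;

typedef struct {
    Slot          slots[POOL_SIZE];
    unsigned long clock;   /* monotonically increasing use counter */
} Pool;

static void pool_init(Pool *p)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        p->slots[i].page = NO_PAGE;
        p->slots[i].tick = 0;
    }
    p->clock = 0;
}

static bool pool_contains(const Pool *p, int page)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
        if (p->slots[i].page == page)
            return true;
    return false;
}

/*
 * Access a page. On a miss, the least recently used slot is evicted.
 * With free_behind set, a newly loaded page is stamped 0 so it becomes
 * the next victim; a page that was already cached keeps its stamp.
 */
static void pool_access(Pool *p, int page, bool free_behind)
{
    size_t target = POOL_SIZE;

    for (size_t i = 0; i < POOL_SIZE; i++)
        if (p->slots[i].page == page)
            target = i;

    bool hit = (target != POOL_SIZE);
    if (!hit) {
        target = 0;
        for (size_t i = 1; i < POOL_SIZE; i++)
            if (p->slots[i].tick < p->slots[target].tick)
                target = i;
        p->slots[target].page = page;   /* stands in for the disk read */
    }

    if (!free_behind)
        p->slots[target].tick = ++p->clock;
    else if (!hit)
        p->slots[target].tick = 0;
}
```

With three hot pages loaded normally, a ten-page scan using free_behind leaves all three in the pool. The same scan without the flag evicts them.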

> My thought on this is that for large seqscans we could think about
> doing reads through a file descriptor that's opened with O_DIRECT.
> But writes should never go through O_DIRECT. In some scenarios this
> would mean having two FDs open for the same relation file. This'd
> require moderately extensive changes to the smgr-related APIs, but
> it doesn't seem totally out of the question. I'd kinda like to see
> some experimental evidence that it's worth doing though. Anyone
> care to make a quick-hack prototype and do some measurements?

True, it is a cost/benefit issue. My assumption was that once we have
free-behind in the PostgreSQL shared buffer cache, the kernel cache
issues would be minimal, but I am willing to be proven wrong.
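
For anyone tempted by the quick-hack prototype, the two-descriptor idea might be sketched roughly as below. This is not the smgr API: RelFile, rel_open, scan_read_block and write_block are hypothetical names, and BLOCK_SIZE of 8192 is an assumption that happens to match the usual PostgreSQL block size. Writes always go through the ordinary descriptor. Scan reads use a second descriptor opened with O_DIRECT where the platform provides it, and fall back to the buffered descriptor otherwise. O_DIRECT generally requires an aligned buffer, hence posix_memalign.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192        /* assumed block size for this sketch */

typedef struct {
    int write_fd;              /* buffered descriptor, used for all writes */
    int scan_fd;               /* O_DIRECT descriptor for large seqscans,
                                * or the same as write_fd on fallback */
    int scan_is_direct;
} RelFile;

static int rel_open(RelFile *rf, const char *path)
{
    rf->write_fd = open(path, O_RDWR);
    if (rf->write_fd < 0)
        return -1;

    rf->scan_fd = -1;
    rf->scan_is_direct = 0;
#ifdef O_DIRECT
    rf->scan_fd = open(path, O_RDONLY | O_DIRECT);
    if (rf->scan_fd >= 0)
        rf->scan_is_direct = 1;
#endif
    if (rf->scan_fd < 0)
        rf->scan_fd = rf->write_fd;   /* no O_DIRECT here: share one FD */
    return 0;
}

static void rel_close(RelFile *rf)
{
    if (rf->scan_fd != rf->write_fd)
        close(rf->scan_fd);
    close(rf->write_fd);
}

/* Allocate one block aligned to BLOCK_SIZE, as direct I/O usually needs. */
static void *alloc_block(void)
{
    void *buf = NULL;
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return NULL;
    return buf;
}

/* Seqscan read: try the direct descriptor, retry buffered on EINVAL. */
static ssize_t scan_read_block(const RelFile *rf, long blockno, void *buf)
{
    off_t   off = (off_t) blockno * BLOCK_SIZE;
    ssize_t n = pread(rf->scan_fd, buf, BLOCK_SIZE, off);

    if (n < 0 && errno == EINVAL && rf->scan_is_direct)
        n = pread(rf->write_fd, buf, BLOCK_SIZE, off);
    return n;
}

/* Writes never go through O_DIRECT. */
static ssize_t write_block(const RelFile *rf, long blockno, const void *buf)
{
    return pwrite(rf->write_fd, buf, BLOCK_SIZE, (off_t) blockno * BLOCK_SIZE);
}
```

A measurement harness would time a large scan through scan_read_block against the same scan through the buffered descriptor, ideally while a second workload depends on the kernel cache. Whether the direct path wins is exactly the open question here.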

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
