Re: O_DIRECT setting

From: Neil Conway <neilc(at)samurai(dot)com>
To: Guy Thornley <guy(at)esphion(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: O_DIRECT setting
Date: 2004-09-23 03:58:34
Message-ID: 1095911914.22485.414.camel@localhost.localdomain
Lists: pgsql-performance

On Mon, 2004-09-20 at 17:57, Guy Thornley wrote:
> According to the manpage, O_DIRECT implies O_SYNC:
>
> File I/O is done directly to/from user space buffers. The I/O is
> synchronous, i.e., at the completion of the read(2) or write(2)
> system call, data is guaranteed to have been transferred.

This seems like it would be a rather large net loss. PostgreSQL already
structures writes so that the writes we need to hit disk immediately
(WAL records) are fsync()'ed -- the kernel is given more freedom to
schedule how other writes are flushed from the cache. My recollection
is that O_DIRECT also disables readahead -- if that's correct, that's
not what we want either.
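
To make the distinction concrete, here is a minimal sketch of the two
kinds of writes. It is written in Python rather than the backend's C, the
function names are made up for illustration, and it is not PostgreSQL's
actual code; it only assumes a POSIX-like system where os.fsync() is
available.

```python
import os
import tempfile


def _write_all(fd: int, data: bytes) -> None:
    # os.write may write fewer bytes than requested, so loop until done.
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def write_wal_record(fd: int, record: bytes) -> None:
    """Hypothetical WAL-style write: force the data out before returning."""
    _write_all(fd, record)
    os.fsync(fd)  # block until the kernel has flushed this file's data


def write_data_page(fd: int, page: bytes) -> None:
    """Hypothetical data-file write: leave the page in the kernel cache."""
    _write_all(fd, page)  # no fsync here; the kernel decides when to flush


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wal_fd = os.open(os.path.join(tmp, "wal"),
                         os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        heap_fd = os.open(os.path.join(tmp, "heap"),
                          os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            write_wal_record(wal_fd, b"commit record\n")
            write_data_page(heap_fd, b"\0" * 8192)
        finally:
            os.close(wal_fd)
            os.close(heap_fd)
```

Only the first function pays for a synchronous flush. If every write
behaved synchronously, as the manpage text quoted above suggests for
O_DIRECT, the second kind of write would lose that scheduling freedom.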

BTW, using O_DIRECT has been discussed a few times in the past. Have you
checked the list archives? (for both -performance and -hackers)

> Would people be interested in a performance benchmark?

Sure -- I'd definitely be curious, although as I said I'm skeptical it's
a win.

> I need some benchmark tips :)

Some people have noted that it can be difficult to use contrib/pgbench
to get reproducible results -- you might want to look at Jan's TPC-W
implementation or the OSDL database benchmarks:

http://pgfoundry.org/projects/tpc-w-php/
http://www.osdl.org/lab_activities/kernel_testing/osdl_database_test_suite/

> Incidentally, postgres heap files suffer really, really bad fragmentation,
> which affects sequential scan operations (VACUUM, ANALYZE, REINDEX ...)
> quite drastically. We have in-house patches that somewhat alleviate this,
> but they are not release quality.

Can you elaborate on these "in-house patches"?

-Neil
