Re: [PERFORM] Direct I/O issues

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [PERFORM] Direct I/O issues
Date: 2006-11-23 16:41:36
Message-ID: 200611231641.kANGfae01113@momjian.us
Lists: pgsql-hackers, pgsql-patches, pgsql-performance

I have applied your test_fsync patch for 8.2.  Thanks.

---------------------------------------------------------------------------

Greg Smith wrote:
> I've been trying to optimize a Linux system where benchmarking suggests 
> large performance differences between the various wal_sync_method options 
> (with o_sync being the big winner).  I started that by using 
> src/tools/fsync/test_fsync to get an idea what I was dealing with (and to 
> spot which drives had write caching turned on).  Since those results 
> didn't match what I was seeing in the benchmarks, I've been browsing the 
> backend source to figure out why.  I noticed test_fsync appears to be, 
> ahem, out of sync with what the engine is doing.
> 
> It looks like V8.1 introduced O_DIRECT writes to the WAL, determined at 
> compile time by a series of preprocessor tests in 
> src/backend/access/transam/xlog.c.  When O_DIRECT is available, 
> O_SYNC/O_FSYNC/O_DSYNC writes use it.  test_fsync doesn't do that.
> 
> I moved the new code (in 8.2 beta 3, lines 61-92 in xlog.c) into 
> test_fsync; all the flags had the same name so it dropped right in.  You 
> can get the version I made at http://www.westnet.com/~gsmith/test_fsync.c 
> (fixed a compiler warning, too)
> 
> The results I get now look fishy.  I'm not sure if I screwed up a step, or 
> if I'm seeing a real problem.  The system here is running RedHat Linux, 
> RHEL ES 4.0 kernel 2.6.9, and the disk I'm writing to is a standard 
> 7200RPM IDE drive.  I turned off write caching with hdparm -W 0
> 
> Here's an excerpt from the stock test_fsync:
> 
> Compare one o_sync write to two:
>          one 16k o_sync write     8.717944
>          two 8k o_sync writes    17.501980
> 
> Compare file sync methods with 2 8k writes:
>          (o_dsync unavailable)
>          open o_sync, write      17.018495
>          write, fdatasync         8.842473
>          write, fsync,            8.809117
> 
> And here's the version I tried to modify to include O_DIRECT support:
> 
> Compare one o_sync write to two:
>          one 16k o_sync write     0.004995
>          two 8k o_sync writes     0.003027
> 
> Compare file sync methods with 2 8k writes:
>          (o_dsync unavailable)
>          open o_sync, write       0.004978
>          write, fdatasync         8.845498
>          write, fsync,            8.834037
> 
> Obviously the o_sync writes aren't waiting for the disk.  Is this a 
> problem with O_DIRECT under Linux?  Or is my code just not correctly 
> testing this behavior?
> 
> Just as a sanity check, I did try this on another system, running SuSE 
> with drives connected to a cciss SCSI device, and I got exactly the same 
> results.  I'm concerned that Linux users who use O_SYNC because they 
> notice it's faster will be losing their WAL integrity without being aware 
> of the problem, especially as the whole O_DIRECT business isn't even 
> mentioned in the WAL documentation--it really deserves to be brought up in 
> the wal_sync_method notes at 
> http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html
> 
> And while I'm mentioning improvements to that particular documentation 
> page...the wal_buffers notes there are so sparse they misled me initially. 
> They suggest only bumping it up for situations with very large 
> transactions; since I was testing with small ones I left it woefully 
> undersized initially.  I would suggest copying the text from 
> http://developer.postgresql.org/pgdocs/postgres/wal-configuration.html to 
> here: "When full_page_writes is set and the system is very busy, setting 
> this value higher will help smooth response times during the period 
> immediately following each checkpoint."  That seems to match what I found 
> in testing.
> 
> --
> * Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

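For anyone reproducing the comparison quoted above, here is a minimal C sketch of an O_SYNC write loop that optionally adds O_DIRECT.  It is not the code from test_fsync.c or xlog.c; the function names (rotation_ms, sync_open_flags, timed_sync_writes) and the 4096-byte ALIGNMENT are assumptions made for illustration.  On Linux, O_DIRECT generally needs an aligned buffer, and an unaligned write() can fail with EINVAL.  A timing loop that ignores write()'s return value would then finish almost instantly, which is one possible explanation for the near-zero figures and is not confirmed anywhere in this message.  The sketch therefore aligns the buffer and checks every call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* asks glibc to expose O_DIRECT on Linux */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* 8k, the write size used in the quoted tests */
#define ALIGNMENT  4096         /* assumed; the real requirement depends on the
                                 * filesystem and device */

/* Time for one platter rotation, in milliseconds. */
double rotation_ms(double rpm)
{
    return 60000.0 / rpm;
}

/* O_SYNC open flags, plus O_DIRECT when the platform defines it and the
 * caller asks for it (hypothetical helper, not the xlog.c logic). */
int sync_open_flags(int want_direct)
{
    int flags = O_RDWR | O_CREAT | O_SYNC;
#ifdef O_DIRECT
    if (want_direct)
        flags |= O_DIRECT;
#else
    (void) want_direct;
#endif
    return flags;
}

/* Rewrite the first block of 'path' nwrites times and store the elapsed
 * seconds.  Returns 0 on success, or -1 with errno set as soon as any
 * call fails, so a failed write is never counted as a fast one. */
int timed_sync_writes(const char *path, int flags, int nwrites, double *seconds)
{
    void *buf = NULL;
    struct timeval start, stop;
    int fd, i, rc, saved = 0;

    assert(nwrites > 0 && seconds != NULL);

    rc = posix_memalign(&buf, ALIGNMENT, BLOCK_SIZE);
    if (rc != 0)
    {
        errno = rc;
        return -1;
    }
    memset(buf, 'a', BLOCK_SIZE);

    fd = open(path, flags, 0600);
    if (fd < 0)
    {
        saved = errno;
        free(buf);
        errno = saved;
        return -1;
    }

    gettimeofday(&start, NULL);
    for (i = 0; i < nwrites && saved == 0; i++)
    {
        ssize_t n;

        if (lseek(fd, 0, SEEK_SET) < 0)
            saved = errno;
        else if ((n = write(fd, buf, BLOCK_SIZE)) < 0)
            saved = errno;
        else if (n != BLOCK_SIZE)
            saved = EIO;        /* short write: treat as an error */
    }
    gettimeofday(&stop, NULL);

    close(fd);
    free(buf);
    if (saved != 0)
    {
        errno = saved;
        return -1;
    }
    *seconds = (stop.tv_sec - start.tv_sec) +
               (stop.tv_usec - start.tv_usec) / 1e6;
    return 0;
}
```

Calling timed_sync_writes() once with sync_open_flags(0) and once with sync_open_flags(1) gives the two cases to compare.  As a rough yardstick, rotation_ms(7200) is about 8.33 ms, so with write caching off a per-write time far below that on a 7200RPM drive suggests the data is not reaching the platter.
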
-- 
  Bruce Momjian   bruce(at)momjian(dot)us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Attachment: /rtmp/diff
Description: text/x-diff (1.7 KB)
