fsync method checking

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
Cc: pgsql-performance(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: fsync method checking
Date: 2003-12-12 06:49:26
Message-ID: 200312120649.hBC6nQR15608@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Mark Kirkwood wrote:
> This is a well-worn thread title - apologies, but these results seemed
> interesting, and hopefully useful in the quest to get better performance
> on Solaris:
>
> I was curious to see if the rather uninspiring pgbench performance
> obtained from a Sun 280R (see General: ATA Disks and RAID controllers
> for database servers) could be improved if more time was spent
> tuning.
>
> With the help of a fellow workmate who is a bit of a Solaris guy, we
> decided to have a go.
>
> The major performance killer appeared to be mounting the filesystem with
> the logging option. The next most significant seemed to be the choice of
> sync_method for Pg - the default (open_datasync), which we initially
> thought should be the best - appears noticeably slower than fdatasync.

I thought the default was fdatasync, but looking at the code it seems
the default is open_datasync if O_DSYNC is available.

I assume the logic is that we usually do only one write() before
fsync(), so open_datasync should be faster. Why do we not use O_FSYNC
over fsync().

Looking at the code:

#if defined(O_SYNC)
#define OPEN_SYNC_FLAG O_SYNC
#else
#if defined(O_FSYNC)
#define OPEN_SYNC_FLAG O_FSYNC
#endif
#endif

#if defined(OPEN_SYNC_FLAG)
#if defined(O_DSYNC) && (O_DSYNC != OPEN_SYNC_FLAG)
#define OPEN_DATASYNC_FLAG O_DSYNC
#endif
#endif

#if defined(OPEN_DATASYNC_FLAG)
#define DEFAULT_SYNC_METHOD_STR "open_datasync"
#define DEFAULT_SYNC_METHOD SYNC_METHOD_OPEN
#define DEFAULT_SYNC_FLAGBIT OPEN_DATASYNC_FLAG
#else
#if defined(HAVE_FDATASYNC)
#define DEFAULT_SYNC_METHOD_STR "fdatasync"
#define DEFAULT_SYNC_METHOD SYNC_METHOD_FDATASYNC
#define DEFAULT_SYNC_FLAGBIT 0
#else
#define DEFAULT_SYNC_METHOD_STR "fsync"
#define DEFAULT_SYNC_METHOD SYNC_METHOD_FSYNC
#define DEFAULT_SYNC_FLAGBIT 0
#endif
#endif

I think the problem is that we prefer O_DSYNC over fdatasync, but do not
prefer O_FSYNC over fsync.

Running the attached test program shows on BSD/OS 4.3:

write 0.000360
write & fsync 0.001391
write, close & fsync 0.001308
open o_fsync, write 0.000924

showing O_FSYNC faster than fsync().

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Attachment Content-Type Size
unknown_filename text/plain 2.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Hallgren 2003-12-12 07:30:26 Re: pljava revisited
Previous Message Somasekhar Bangalore 2003-12-12 04:29:23 Re: [GENERAL][HACKERS]data fragmentation

Browse pgsql-performance by date

  From Date Subject
Next Message Tomasz Myrta 2003-12-12 07:46:37 Re: Measuring execution time for sql called from PL/pgSQL
Previous Message Shridhar Daithankar 2003-12-12 06:35:49 Re: Hardware suggestions for Linux/PGSQL server