From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Cc: | Dan Scales <scales(at)vmware(dot)com> |
Subject: | Re: possible new option for wal_sync_method |
Date: | 2012-02-16 18:32:09 |
Message-ID: | 201202161932.09708.andres@anarazel.de |
Lists: | pgsql-hackers |
Hi,
On Thursday, February 16, 2012 06:18:23 PM Dan Scales wrote:
> When running Postgres on a single ext3 filesystem on Linux, we find that
> the attached simple patch gives significant performance benefit (7-8% in
> numbers below). The patch adds a new option for wal_sync_method, which
> is "open_direct". With this option, the WAL is always opened with
> O_DIRECT (but not O_SYNC or O_DSYNC). For Linux, the use of only
> O_DIRECT should be correct. All WAL logs are fully allocated before
> being used, and the WAL buffers are 8K-aligned, so all direct writes are
> guaranteed to complete before returning. (See
> http://lwn.net/Articles/348739/)
I don't think that behaviour is safe in the face of write caches in the IO
path. Linux takes care to issue flush/barrier requests when necessary if
you issue an fsync/fdatasync, but to my knowledge it does not when O_DIRECT is
used (that would suck performance-wise).
I think that behaviour is safe if you have no externally visible write caching
enabled, but that's not exactly knowledge that is easy to obtain or document.
Why should there otherwise be any performance difference between
O_DIRECT|O_SYNC and O_DIRECT in the WAL write case? There is no metadata that
needs to be written, and I have a hard time imagining that the check whether
there is metadata is that expensive.
I guess a more interesting case would be comparing O_DIRECT|O_SYNC with
O_DIRECT + fdatasync() or even O_DIRECT +
sync_file_range(SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE |
SYNC_FILE_RANGE_WAIT_AFTER)
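
To make the three variants concrete, here is a minimal, Linux-specific C sketch (not PostgreSQL code). The names write_wal_block, enum sync_variant and WAL_BLOCK are hypothetical, the 8192-byte block size is assumed from the 8K alignment mentioned in the quoted text, and the fallback to buffered IO when O_DIRECT is rejected is an assumption added only so the sketch runs on filesystems without direct IO support.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes O_DIRECT and sync_file_range() on Linux/glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0      /* assumption: degrade to buffered IO where O_DIRECT is unavailable */
#endif

#define WAL_BLOCK 8192  /* assumed block size, matching the 8K alignment quoted above */

/* Hypothetical names for the variants under discussion. */
enum sync_variant {
    DIRECT_PLUS_O_SYNC,          /* O_DIRECT | O_SYNC, no explicit sync call */
    DIRECT_PLUS_FDATASYNC,       /* O_DIRECT, then fdatasync() */
    DIRECT_PLUS_SYNC_FILE_RANGE  /* O_DIRECT, then sync_file_range() */
};

/*
 * Write one zero-filled, aligned block at offset 0 of `path`.
 * Returns 0 on success, -1 on failure.
 */
int write_wal_block(const char *path, enum sync_variant variant)
{
    int   flags = O_WRONLY | O_CREAT | O_DIRECT;
    void *buf = NULL;
    int   fd;
    int   rc = 0;

    assert(path != NULL);

    if (variant == DIRECT_PLUS_O_SYNC)
        flags |= O_SYNC;

    fd = open(path, flags, 0600);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, flags & ~O_DIRECT, 0600);  /* filesystem rejected O_DIRECT */
    if (fd < 0)
        return -1;

    /* Direct IO needs an aligned buffer, length and file offset. */
    if (posix_memalign(&buf, WAL_BLOCK, WAL_BLOCK) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, 0, WAL_BLOCK);

    if (pwrite(fd, buf, WAL_BLOCK, 0) != WAL_BLOCK) {
        rc = -1;
    } else if (variant == DIRECT_PLUS_FDATASYNC) {
        rc = fdatasync(fd);
    } else if (variant == DIRECT_PLUS_SYNC_FILE_RANGE) {
#ifdef SYNC_FILE_RANGE_WRITE
        rc = sync_file_range(fd, 0, WAL_BLOCK,
                             SYNC_FILE_RANGE_WAIT_BEFORE |
                             SYNC_FILE_RANGE_WRITE |
                             SYNC_FILE_RANGE_WAIT_AFTER);
#else
        rc = fdatasync(fd);  /* sync_file_range() not exposed by these headers */
#endif
    }

    free(buf);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Timing each variant in a loop over a preallocated file would give the comparison suggested above. Note that, per the sync_file_range(2) man page, that call does not write out file metadata, so it is only a candidate for overwrites of already allocated blocks.
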
Any special reason you did that comparison on ext3? Especially with
data=ordered, its behaviour regarding syncs is pretty insane performance-wise.
Ext4 would be a bit more interesting...
Andres