From:
Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To:
Florian Weimer <fw(at)deneb(dot)enyo(dot)de>
Cc:
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>,
Greg Stark <gsstark(at)mit(dot)edu>,
Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:
Re: Re: Faster CREATE DATABASE by delaying fsync
Date:
2010-02-14 20:41:02
Message-ID:
4B785FDE.8040308@mark.mielke.cc
Lists:
pgsql-hackers pgsql-performance
On 02/14/2010 03:24 PM, Florian Weimer wrote:
> * Tom Lane:
>
>>> Which options would that be? I am not aware that there are any for any of the
>>> recent linux filesystems.
>>>
>> Shouldn't journaling of metadata be sufficient?
>>
> You also need to enforce ordering between the directory update and the
> file update. The file metadata is flushed with fsync(), but the
> directory isn't. On some systems, all directory operations are
> synchronous, but not on Linux.
>

    dirsync
        All directory updates within the filesystem should be done
        synchronously. This affects the following system calls: creat,
        link, unlink, symlink, mkdir, rmdir, mknod and rename.

The widely reported problems, though, were not caused by directory
changes being written too late, but by directory changes being written
too early. That is, the directory change is written to disk, but the
file content is not. This is likely because of the "ordered journal"
mode widely used in ext3/ext4, where metadata changes are journalled but
file pages are not. Therefore, for some operations it is important that
the file pages are pushed to disk using fsync(file) before the metadata
changes are journalled.
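
To illustrate the ordering described above, here is a minimal sketch in Python. It is not taken from PostgreSQL or from this thread; the helper name write_then_rename and the temporary-file approach are assumptions for illustration. The file contents are fsync'ed first, and only then is the directory entry changed by the rename, so the metadata change cannot become visible ahead of the data.

```python
import os
import tempfile


def write_then_rename(path: str, data: bytes) -> None:
    """Hypothetical helper: write data to a temp file, fsync it, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # push Python's userspace buffer to the kernel
            os.fsync(f.fileno())   # ask the kernel to push the file pages to disk
        # The directory (metadata) change is issued only after the fsync has returned.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that this only orders the file data ahead of the rename; it does not by itself make the rename durable.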

In theory there is still an open hole where directory updates need to be
synchronized with file updates, since POSIX doesn't enforce this ordering
and we can't trust that all file systems implicitly order things
correctly. In practice, though, I don't see this sort of problem
happening. If you are concerned, mount the filesystem with dirsync.
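
For readers who cannot change mount options, a commonly used alternative on Linux is to fsync the directory itself after creating or renaming a file in it. The sketch below is an assumption added for illustration, not something proposed in this message; fsync_directory is a hypothetical name, and the approach relies on a POSIX-like system where a directory can be opened read-only and passed to fsync (it does not work on Windows).

```python
import os


def fsync_directory(dir_path: str) -> None:
    """Hypothetical helper: ask the kernel to flush a directory's own entries to disk."""
    # Opening a directory read-only and calling fsync on it works on Linux;
    # behaviour on other systems and filesystems is an assumption to verify.
    fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

A caller would typically fsync the file first, perform the create or rename, and then call fsync_directory on the containing directory.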
Cheers,
mark
--
Mark Mielke <mark(at)mielke(dot)cc>