From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Potential data loss of 2PC files |
Date: | 2017-01-31 02:07:10 |
Message-ID: | CAB7nPqSpq+GUZV6kFFa9hpKpakznBpB+YR+Q9BKbvX6Xd6vvJw@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jan 30, 2017 at 10:52 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> So, if I understood correctly, the problem scenario is:
>
> 1. Create and write to a file.
> 2. fsync() the file.
> 3. Crash.
> 4. After restart, the file is gone.
Yes. fsync() on the file alone does not make its directory entry
durable, and we need that durability to hold at checkpoint. I find [1]
a good read on the matter; it is easier to decipher than [2] or [3] in
the POSIX spec.
[1]: http://blog.httrack.com/blog/2013/11/15/everything-you-always-wanted-to-know-about-fsync/
[2]: http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html
[3]: http://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html
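For illustration, here is a minimal sketch of the sequence that closes the gap in the scenario above: fsync the file, then fsync its parent directory so the new directory entry is durable too. It assumes a POSIX system where a directory can be opened read-only and passed to fsync(). The name durable_create is hypothetical, and this is not the code PostgreSQL uses.

```python
import os


def durable_create(path: str, data: bytes) -> None:
    """Create `path` holding `data`, then flush the file and its parent directory.

    Hypothetical helper for illustration only; assumes a POSIX system.
    """
    # Steps 1 and 2 of the scenario: create and write the file, then fsync it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        offset = 0
        while offset < len(data):
            # os.write() may write fewer bytes than requested, so loop.
            offset += os.write(fd, data[offset:])
        os.fsync(fd)
    finally:
        os.close(fd)

    # The step the scenario lacks: fsync the parent directory so that the
    # directory entry pointing at the new file also reaches stable storage.
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the second fsync(), the file contents may be on disk while the entry naming the file is not, which is how the file can be gone after restart.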
> If that can happen, don't we have the same problem in many other places?
> Like, all the SLRUs? They don't fsync the directory either.
Right, pg_commit_ts and pg_clog fall into this category.
> Is unlink() guaranteed to be durable, without fsyncing the directory? If
> not, then we need to fsync() the directory even if there are no files in it
> at the moment, because some might've been removed earlier in the checkpoint
> cycle.
Hm... I am not an expert in file systems. At least on ext4 I can see
that unlink() is atomic, but not durable. So if an unlink() is
followed by a power failure, the previously unlinked file could
reappear after restart if the parent directory has not been fsync'd.
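As a rough sketch of what that implies, a removal that has to survive a crash would unlink the file and then fsync the parent directory. The directory fsync is also valid when the directory is empty, which covers the case raised in the question. The names fsync_dir and durable_unlink are hypothetical, and a POSIX system is assumed.

```python
import os


def fsync_dir(dir_path: str) -> None:
    """Flush a directory's entries to stable storage (POSIX assumption)."""
    fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_unlink(path: str) -> None:
    """Remove `path`, then fsync its parent so the removal itself is durable.

    Hypothetical helper for illustration only.
    """
    parent = os.path.dirname(os.path.abspath(path))
    os.unlink(path)
    # Without this, the directory entry could still exist after a power failure.
    fsync_dir(parent)
```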
--
Michael