From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Craig Ringer <craig(at)2ndquadrant(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anthony Iliopoulos <ailiop(at)altatus(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS |
Date: | 2018-04-08 02:33:37 |
Message-ID: | 20180408023337.GA21781@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Apr 8, 2018 at 02:16:07PM +1200, Thomas Munro wrote:
> So, what can we actually do about this new Linux behaviour?
>
> Idea 1:
>
> * whenever you open a file, either tell the checkpointer so it can
> open it too (and wait for it to tell you that it has done so, because
> it's not safe to write() until then), or send it a copy of the file
> descriptor via IPC (since duplicated file descriptors share the same
> f_wb_err)
>
> * if the checkpointer can't take any more file descriptors (how would
> that limit even work in the IPC case?), then it somehow needs to tell
> you that so that you know that you're responsible for fsyncing that
> file yourself, both on close (due to fd cache recycling) and also when
> the checkpointer tells you to
>
> Maybe it could be made to work, but sheesh, that seems horrible. Is
> there some simpler idea along these lines that could make sure that
> fsync() is only ever called on file descriptors that were opened
> before all unflushed writes, or file descriptors cloned from such file
> descriptors?
>
> Idea 2:
>
> Give up, complain that this implementation is defective and
> unworkable, both on POSIX-compliance grounds and on POLA grounds, and
> campaign to get it fixed more fundamentally (actual details left to
> the experts, no point in speculating here, but we've seen a few
> approaches that work on other operating systems including keeping
> buffers dirty and marking the whole filesystem broken/read-only).
>
> Idea 3:
>
> Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.
Idea 4 would be for people to assume their database is corrupt if their
server logs report any I/O error on the file systems Postgres uses.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
From | Date | Subject | |
---|---|---|---|
Next Message | Christophe Pettus | 2018-04-08 02:37:47 | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS |
Previous Message | Thomas Munro | 2018-04-08 02:16:07 | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS |