From: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS |
Date: | 2018-03-29 05:32:43 |
Message-ID: | CAMsr+YFET=hWD7A5ULSpbbiZ6e-N-W7ajga1zZfzy=mko5qfqA@mail.gmail.com |
Lists: | pgsql-hackers |
On 29 March 2018 at 10:48, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:
> On Thu, Mar 29, 2018 at 3:30 PM, Michael Paquier <michael(at)paquier(dot)xyz>
> wrote:
> > On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:
> >> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> >>> TL;DR: Pg should PANIC on fsync() EIO return.
> >>
> >> Surely you jest.
> >
> > Any callers of pg_fsync in the backend code are careful enough to check
> > the returned status, sometimes doing retries like in mdsync, so what is
> > proposed here would be a regression.
>
> Craig, is the phenomenon you described the same as the second issue
> "Reporting writeback errors" discussed in this article?
>
> https://lwn.net/Articles/724307/
A variant of it, by the looks.
The problem in our case is that the kernel only tells us about the error
once. It then forgets about it. So yes, that seems like a variant of the
statement:
> "Current kernels might report a writeback error on an fsync() call,
> but there are a number of ways in which that can fail to happen."
>
> That's... I'm speechless.
Yeah.
It's a bit nuts.
I was astonished when I saw the behaviour, and that it appears undocumented.
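To make the concern concrete, here is a toy sketch in Python of why retrying fsync() is unsafe if the error is only reported once. Everything in it is hypothetical and for illustration only: `ForgetfulKernel` is a stand-in that models "report the error once, then forget it", not real kernel code, and `naive_retry`, `fsync_once` and `DurabilityError` are made-up names, not anything in the PostgreSQL source.

```python
import errno


class DurabilityError(RuntimeError):
    """Hypothetical error: data may not be on disk and must not be assumed durable."""


class ForgetfulKernel:
    """Toy model (an assumption, not real kernel code): the writeback error
    is reported on the first fsync() only and is then forgotten."""

    def __init__(self):
        self.pending_error = True

    def fsync(self, fd):
        if self.pending_error:
            self.pending_error = False  # the error is cleared once reported
            raise OSError(errno.EIO, "Input/output error")
        # Later calls succeed even though the failed pages were never rewritten.


def naive_retry(fd, fsync, attempts=3):
    """Retry until fsync() succeeds; returns True if any attempt succeeded."""
    for _ in range(attempts):
        try:
            fsync(fd)
            return True
        except OSError:
            continue
    return False


def fsync_once(fd, fsync):
    """Call fsync() exactly once and treat any failure as fatal for that data."""
    try:
        fsync(fd)
    except OSError as exc:
        # No retry: a second call could report success without the data
        # ever having reached stable storage.
        raise DurabilityError(f"fsync failed: {exc}") from exc
```

With this model, `naive_retry(-1, ForgetfulKernel().fsync)` reports success on the second attempt although nothing was persisted, while `fsync_once` surfaces the failure to the caller. That is the reasoning behind treating the first EIO as fatal rather than retrying.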
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services