Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc:	Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date:	2018-03-29 05:25:51
Message-ID:	CAMsr+YEa4tv1UCBRQHzA1ycfdvryHFYJ1LhaJJNbjStO3=M9Hg@mail.gmail.com
Lists:	pgsql-hackers

On 29 March 2018 at 13:06, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com>
> wrote:
> > The retries are the source of the problem; the first fsync() can return
> > EIO, and also *clears the error* causing a 2nd fsync (of the same data)
> > to return success.
>
> What I'm failing to grok here is how that error flag even matters,
> whether it's a single bit or a counter as described in that patch. If
> write back failed, *the page is still dirty*. So all future calls to
> fsync() need to try to flush it again, and (presumably) fail
> again (unless it happens to succeed this time around).

You'd think so. But it doesn't appear to work that way. You can see it for
yourself by mapping the device-mapper "error" target over part of a
volume.

I wrote a test case here:

https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c
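The core of such a check can be sketched roughly as follows. This is not the
linked test case, only a minimal illustration: the helper name
write_then_fsync_twice and the struct fsync_result are hypothetical, and the
file is assumed to live on a volume where part of the device is mapped to the
device-mapper "error" target.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: outcome of a single fsync() call. */
struct fsync_result {
    int rc;   /* return value of fsync() */
    int err;  /* errno if rc != 0, otherwise 0 */
};

/*
 * Write len bytes of buf to path, then call fsync() twice on the same
 * descriptor with no intervening write. Returns 0 if the open and write
 * succeeded, -1 otherwise. The two fsync() outcomes are stored in
 * *first and *second for the caller to compare.
 */
static int
write_then_fsync_twice(const char *path, const char *buf, size_t len,
                       struct fsync_result *first,
                       struct fsync_result *second)
{
    assert(path != NULL && first != NULL && second != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    if (n < 0 || (size_t) n != len) {
        close(fd);
        return -1;
    }

    /* First fsync(): on a failing device this is where EIO shows up. */
    first->rc = fsync(fd);
    first->err = (first->rc != 0) ? errno : 0;

    /* Second fsync() of the same data: the "retry". */
    second->rc = fsync(fd);
    second->err = (second->rc != 0) ? errno : 0;

    close(fd);
    return 0;
}
```

If the behaviour described above holds, the first result would carry EIO and
the second would report success even though the data never reached the disk.
On a healthy filesystem both calls simply return 0.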

I don't pretend the kernel behaviour is sane. And it's possible I've made
an error in my analysis. But since I've observed this in the wild, and seen
it in a test case, I strongly suspect that what I've described is just
what's happening, brain-dead or no.

Presumably the kernel marks the page clean when it dispatches it to the I/O
subsystem and doesn't dirty it again on I/O error? I haven't dug that deep
on the kernel side. See the Stack Overflow post for details on what I found
in kernel code analysis.
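To make that guess concrete, here is a toy model of it. This is not kernel
code and is not derived from kernel source; struct toy_file, toy_write and
toy_fsync are made-up names, and the model only encodes the assumption stated
above (page marked clean at dispatch, not re-dirtied on error, error reported
once and then cleared).

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only: a single cached page belonging to one file. */
struct toy_file {
    bool page_dirty;     /* page holds data not yet handed to the device */
    bool error_pending;  /* a writeback error that has not been reported */
    bool device_fails;   /* simulates a device that rejects every write */
};

/* A write() just dirties the cached page. */
static void
toy_write(struct toy_file *f)
{
    f->page_dirty = true;
}

/* Returns 0 on "success" or EIO if a writeback error is reported. */
static int
toy_fsync(struct toy_file *f)
{
    if (f->page_dirty) {
        /* Assumption: the page is marked clean when it is dispatched. */
        f->page_dirty = false;
        if (f->device_fails) {
            /* Assumption: the error is recorded, the page stays clean. */
            f->error_pending = true;
        }
    }

    if (f->error_pending) {
        /* Assumption: reporting the error also clears it. */
        f->error_pending = false;
        return EIO;
    }
    return 0;
}
```

Under these assumptions, a toy_write() followed by two toy_fsync() calls on a
failing device yields EIO and then 0, which matches the pattern reported
earlier in the thread: the retry has nothing dirty left to flush and no error
left to report.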

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS at 2018-03-29 05:06:22 from Thomas Munro

Responses

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS at 2018-04-21 19:21:39 from Gasper Zejn
