Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date:	2018-03-29 13:15:10
Message-ID:	CAMsr+YH8JP-UdsGt0dLMcDRx6WQ78BZA7kMgimu8+ZuB_uzyFQ@mail.gmail.com
Lists:	pgsql-hackers

On 29 March 2018 at 20:07, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer <craig(at)2ndquadrant(dot)com>
> wrote:
> > On 28 March 2018 at 11:53, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>
> >> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> >> > TL;DR: Pg should PANIC on fsync() EIO return.
> >>
> >> Surely you jest.
> >
> > No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as
> > well to avoid similar lost-page-write issues.
>
> I found your discussion with kernel hacker Jeff Layton at
> https://lwn.net/Articles/718734/ in which he said: "The stackoverflow
> writeup seems to want a scheme where pages stay dirty after a
> writeback failure so that we can try to fsync them again. Note that
> that has never been the case in Linux after hard writeback failures,
> AFAIK, so programs should definitely not assume that behavior."
>
> The article above that says the same thing a couple of different ways,
> ie that writeback failure leaves you with pages that are neither
> written to disk successfully nor marked dirty.
>
> If I'm reading various articles correctly, the situation was even
> worse before his errseq_t stuff landed. That fixed cases of
> completely unreported writeback failures due to sharing of PG_error
> for both writeback and read errors with certain filesystems, but it
> doesn't address the clean pages problem.
>
> Yeah, I see why you want to PANIC.
>

In more ways than one ;)

> > I'm not seeking to defend what the kernel seems to be doing. Rather, saying
> > that we might see similar behaviour on other platforms, crazy or not. I
> > haven't looked past linux yet, though.
>
> I see no reason to think that any other operating system would behave
> that way without strong evidence... This is openly acknowledged to be
> "a mess" and "a surprise" in the Filesystem Summit article. I am not
> really qualified to comment, but from a cursory glance at FreeBSD's
> vfs_bio.c I think it's doing what you'd hope for... see the code near
> the comment "Failed write, redirty."

Ok, that's reassuring, but doesn't help us on the platform the great
majority of users deploy on :(

"If on Linux, PANIC"

Hrm.
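
For concreteness, here is a minimal sketch of what such a policy could look
like in C. This is not PostgreSQL's actual code: the names
fsync_error_is_fatal and fsync_or_panic are hypothetical, and treating ENOSPC
as fatal is only the tentative idea raised above, not a settled decision. The
point is that an EIO from fsync() is never retried, since a later fsync() may
report success for pages the kernel has already dropped.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical policy: errors after which retrying fsync() is unsafe,
 * because the kernel may have marked the failed pages clean.
 * Including ENOSPC here is an assumption taken from the discussion.
 */
static int
fsync_error_is_fatal(int err)
{
    return err == EIO || err == ENOSPC;
}

/*
 * Hypothetical wrapper: returns 0 on success, -1 (with errno preserved)
 * for errors judged safe to report, and aborts the process for errors
 * where a retry could silently lose writes.
 */
static int
fsync_or_panic(int fd, const char *path)
{
    if (fsync(fd) == 0)
        return 0;

    int saved_errno = errno;

    if (fsync_error_is_fatal(saved_errno))
    {
        fprintf(stderr, "PANIC: could not fsync file \"%s\": %s\n",
                path, strerror(saved_errno));
        abort();            /* crash rather than retry */
    }

    errno = saved_errno;
    return -1;
}
```

A caller would handle a -1 return as an ordinary error; after the abort, the
intent is that recovery redoes the writes from the log rather than trusting
the page cache.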

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS at 2018-03-29 12:07:56 from Thomas Munro
