Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Greg Stark <stark(at)mit(dot)edu>
To: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-09 08:45:40
Message-ID: CAM-w4HN9=1GGcp8EToX5HjEQ8nXhJqVzoRx+i+f1rjpmkEq2PQ@mail.gmail.com
Lists: pgsql-hackers

On 8 April 2018 at 22:47, Anthony Iliopoulos <ailiop(at)altatus(dot)com> wrote:
> On Sun, Apr 08, 2018 at 10:23:21PM +0100, Greg Stark wrote:
>> On 8 April 2018 at 04:27, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>> > On 8 April 2018 at 10:16, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
>
> The question is, what should the kernel and application do in cases
> where this is simply not possible (according to freebsd that keeps
> dirty pages around after failure, for example, -EIO from the block
> layer is a contract for unrecoverable errors so it is pointless to
> keep them dirty). You'd need a specialized interface to clear-out
> the errors (and drop the dirty pages), or potentially just remount
> the filesystem.

Well, firstly, that's not necessarily the question. ENOSPC is not an
unrecoverable error. And even an unrecoverable error for a single write
doesn't mean the write will never be able to succeed in the future.
But secondly, doesn't such an interface already exist? When the device
is dropped, any dirty pages already get dropped with it. What's the
point in dropping them but keeping the failing device?

But just to underline the point: "pointless to keep them dirty" is
exactly backwards from the application's point of view. If the error
writing to persistent media really is unrecoverable, then it's all the
more critical that the pages be kept so the data can be copied to some
other device. The last thing user space expects is that when the data
can't be written to persistent storage, it is also immediately
deleted from RAM. (And the *really* last thing user space expects is
for this to happen and return no error.)
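
To make the expectation concrete, here is a minimal sketch of what an
application can do today if it cannot rely on the kernel keeping the
dirty pages: hold its own copy of the data until fsync() has succeeded,
and on any error write that copy somewhere else. The function names and
the primary/fallback paths are hypothetical, chosen only for
illustration; this is not how PostgreSQL handles it.

```python
import os


def write_and_sync(path, data):
    """Write data to path and fsync it; raises OSError on any failure."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    finally:
        os.close(fd)


def write_durably(data, primary, fallback):
    """Return the path that ends up holding a synced copy of data.

    Assumption: the caller keeps `data` in memory until this returns,
    because after a failed fsync the kernel's copy may already be gone.
    """
    try:
        write_and_sync(primary, data)
        return primary
    except OSError:
        # Do not simply retry fsync on the same file and trust a later
        # success; rewrite from our own copy to a different location.
        write_and_sync(fallback, data)
        return fallback
```

The cost is that the application has to buffer everything it has not
yet synced, which is exactly the duplication one would hope the page
cache makes unnecessary.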

--
greg
