Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Catalin Iacob <iacobcatalin(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, ailiop(at)altatus(dot)com
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-03-31 13:24:28
Message-ID: 20180331132428.GA31905@technoir
Lists: pgsql-hackers

On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:

> >> Yeah, I see why you want to PANIC.
> >
> > Indeed. Even doing that leaves question marks about all the kernel
> > versions before v4.13, which at this point is pretty much everything
> > out there, not even detecting this reliably. This is messy.

There may still be a way to reliably detect this on older kernel
versions from userspace, but it will be messy regardless. On EIO
errors, the kernel will not restore the dirty page flags, but it
will set the error flag on the failed pages. One could mmap()
the file in question, obtain the PFNs (via /proc/pid/pagemap)
and enumerate those to match the ones with the error flag switched
on (via /proc/kpageflags). This could serve at least as a detection
mechanism, but one could also further use this info to logically
map the pages that failed IO back to the original file offsets,
and potentially retry IO just for those file ranges that cover
the failed pages. Just an idea, not tested.
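
A minimal sketch of the detection half of this idea, equally untested
against a real I/O failure. It assumes the kernel really does leave the
error flag on the failed page cache pages, as described above. The bit
positions follow the kernel's pagemap documentation (pagemap bit 63 is
"present", bits 0-54 hold the PFN, kpageflags bit 1 is ERROR); the
function names are hypothetical. Reading PFNs and /proc/kpageflags needs
root (CAP_SYS_ADMIN) on current kernels.

```python
import ctypes
import mmap
import os
import struct

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")
PM_PRESENT = 1 << 63          # pagemap bit 63: page is present in RAM
PM_PFN_MASK = (1 << 55) - 1   # pagemap bits 0-54: PFN when present
KPF_ERROR = 1 << 1            # kpageflags bit 1: I/O error occurred


def pfn_from_entry(entry):
    """Return the PFN stored in a raw pagemap entry, or None if not present."""
    if entry & PM_PRESENT:
        return entry & PM_PFN_MASK
    return None


def failed_offsets(pagemap_entries, flags_for_pfn, page_size=PAGE_SIZE):
    """Map per-page pagemap entries (in file order) to failed file offsets.

    flags_for_pfn is a callable returning the kpageflags word for a PFN.
    """
    offsets = []
    for page_index, entry in enumerate(pagemap_entries):
        pfn = pfn_from_entry(entry)
        if pfn is not None and flags_for_pfn(pfn) & KPF_ERROR:
            offsets.append(page_index * page_size)
    return offsets


def read_u64(fd, index):
    """Read the index-th native-endian 64-bit word from a /proc file."""
    data = os.pread(fd, 8, index * 8)
    if len(data) != 8:
        raise OSError("short read at index %d" % index)
    return struct.unpack("=Q", data)[0]


def scan_file(path):
    """Return offsets in path whose cached pages carry the error flag."""
    size = os.path.getsize(path)
    if size == 0:
        return []
    npages = (size + PAGE_SIZE - 1) // PAGE_SIZE
    with open(path, "rb") as f:
        # A private copy-on-write mapping is writable from Python's point of
        # view, which lets ctypes take its address. Nothing is written to it.
        m = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY)
    pm_fd = os.open("/proc/self/pagemap", os.O_RDONLY)
    kpf_fd = os.open("/proc/kpageflags", os.O_RDONLY)
    try:
        buf = ctypes.c_char.from_buffer(m)
        base = ctypes.addressof(buf)
        del buf  # release the buffer export so the mapping can be closed
        # Touch each page so it is mapped into this process. Caveat: a page
        # that is no longer cached will be read back from disk here.
        for i in range(npages):
            m[i * PAGE_SIZE]
        first = base // PAGE_SIZE
        entries = [read_u64(pm_fd, first + i) for i in range(npages)]
        return failed_offsets(entries, lambda pfn: read_u64(kpf_fd, pfn))
    finally:
        os.close(kpf_fd)
        os.close(pm_fd)
        m.close()
```

Calling scan_file() as root on a relation segment would return the byte
offsets of the pages flagged with an error, which is the list one would
then try to rewrite and fsync() again. Whether the flag survives long
enough to be observed this way is exactly the part that needs testing.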

Best regards,
Anthony
