Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Gasper Zejn <zejn(at)owca(dot)info>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Craig Ringer <craig(at)2ndQuadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anthony Iliopoulos <ailiop(at)altatus(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-09 18:02:21
Message-ID: 75bfe2e2-90b0-d411-56b3-14d440c6b5b0@owca.info
Lists: pgsql-hackers

On 09. 04. 2018 15:42, Tomas Vondra wrote:
> On 04/09/2018 12:29 AM, Bruce Momjian wrote:
>> A crazy idea would be to have a daemon that checks the logs and
>> stops Postgres when something seems wrong.
>>
> That doesn't seem like a very practical way. It's better than nothing,
> of course, but I wonder how that would work with containers (where I
> think you may not have access to the kernel log at all). Also, I'm
> pretty sure the messages do change based on kernel version (and possibly
> filesystem) so parsing it reliably seems rather difficult. And we
> probably don't want to PANIC after I/O error on an unrelated device, so
> we'd need to understand which devices are related to PostgreSQL.
>
> regards
>

For a slightly less (or more) crazy idea: I'd imagine that writing a
Linux kernel module that uses a kprobe/kretprobe pair to capture the
file passed to fsync (or even the byte range within the file) and the
corresponding return value shouldn't be that hard. Kprobes have been
part of the Linux kernel for a really long time, and at first glance it
seems like this could be backported to 2.6 too.

Then you could have stable log messages, or implement some kind of
"fsync error log notification" via whatever is the sanest way to get
this information out of the kernel.

If the kernel is new enough and has eBPF support (seems like >=4.4),
using bcc-tools[1] should enable you to write a quick script to get
exactly that info via perf events[2].
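
A rough, untested sketch of what such a bcc script might look like is
below. It assumes bcc's Python bindings are installed and that it runs as
root. The names on_fsync_return, describe_failure and trace_fsync_errors
are made up for illustration. It only hooks the fsync syscall (not
fdatasync or sync_file_range) and reports pid, command name and errno,
not the file or byte range, which would need an additional entry probe.

```python
import errno
import os

# BPF program (restricted C) that bcc compiles and loads into the kernel.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

struct event_t {
    u32 pid;
    int ret;
    char comm[16];
};
BPF_PERF_OUTPUT(events);

// Runs when the fsync syscall handler returns.
int on_fsync_return(struct pt_regs *ctx)
{
    int ret = PT_REGS_RC(ctx);
    if (ret >= 0)
        return 0;               // only failures are reported

    struct event_t ev = {};
    ev.pid = bpf_get_current_pid_tgid() >> 32;
    ev.ret = ret;
    bpf_get_current_comm(&ev.comm, sizeof(ev.comm));
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


def describe_failure(pid, comm, ret):
    """Format one failed fsync as a stable, parseable log line."""
    code = -ret
    name = errno.errorcode.get(code, "E?")
    return "fsync failed pid=%d comm=%s errno=%s (%s)" % (
        pid, comm.decode("utf-8", "replace"), name, os.strerror(code))


def trace_fsync_errors():
    """Attach the kretprobe and print failures until interrupted."""
    # Imported here so the helper above can be used without bcc installed.
    from bcc import BPF

    b = BPF(text=BPF_PROGRAM)
    b.attach_kretprobe(event=b.get_syscall_fnname("fsync"),
                       fn_name="on_fsync_return")

    def handle(cpu, data, size):
        ev = b["events"].event(data)
        print(describe_failure(ev.pid, ev.comm, ev.ret))

    b["events"].open_perf_buffer(handle)
    while True:
        try:
            b.perf_buffer_poll()
        except KeyboardInterrupt:
            return
```

Calling trace_fsync_errors() as root would then print one line per
failed fsync, in a format that does not depend on kernel version or
filesystem, which a watchdog could match on comm or pid to decide
whether the failure concerns PostgreSQL at all.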

Obviously, that's a stopgap solution ...

Kind regards,
Gasper

[1] https://github.com/iovisor/bcc
[2] https://blog.yadutaf.fr/2016/03/30/turn-any-syscall-into-event-introducing-ebpf-kernel-probes/
