Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anthony Iliopoulos <ailiop(at)altatus(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-17 21:32:45
Message-ID: 20180417213245.GC13097@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:
> On 04/09/2018 12:29 AM, Bruce Momjian wrote:
> >
> > An crazy idea would be to have a daemon that checks the logs and
> > stops Postgres when it seems something wrong.
> >
>
> That doesn't seem like a very practical way. It's better than nothing,
> of course, but I wonder how would that work with containers (where I
> think you may not have access to the kernel log at all). Also, I'm
> pretty sure the messages do change based on kernel version (and possibly
> filesystem) so parsing it reliably seems rather difficult. And we
> probably don't want to PANIC after I/O error on an unrelated device, so
> we'd need to understand which devices are related to PostgreSQL.

Replying to your specific case, I am not sure how we would use a script
to check for I/O errors/space-exhaustion if the postgres user doesn't
have access to it. Does O_DIRECT work in such container cases?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-04-17 21:34:53 Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Previous Message David Rowley 2018-04-17 21:30:50 Re: pgsql: Support partition pruning at execution time