Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anthony Iliopoulos <ailiop(at)altatus(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-18 11:56:57
Message-ID: 20180418115657.GC20040@momjian.us
Lists: pgsql-hackers

On Tue, Apr 17, 2018 at 02:41:42PM -0700, Andres Freund wrote:
> On 2018-04-17 17:32:45 -0400, Bruce Momjian wrote:
> > On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:
> > > That doesn't seem like a very practical way. It's better than nothing,
> > > of course, but I wonder how would that work with containers (where I
> > > think you may not have access to the kernel log at all). Also, I'm
> > > pretty sure the messages do change based on kernel version (and possibly
> > > filesystem) so parsing it reliably seems rather difficult. And we
> > > probably don't want to PANIC after I/O error on an unrelated device, so
> > > we'd need to understand which devices are related to PostgreSQL.
>
> You can certainly have access to the kernel log in containers. I'd
> assume such a script wouldn't check various system logs but instead tail
> /dev/kmsg or such. Otherwise the variance between installations would be
> too big.

I was thinking 'dmesg', but the result is similar.

> There aren't *that* many different types of error messages and they don't
> change that often. If we'd just detect errors for the most common FSs
> we'd probably be good. Detecting a few general storage layer messages
> wouldn't be that hard either, most things have been unified over the
> last ~8-10 years.

It is hard to know exactly what the message format is for each
operating system because such errors are hard to generate on demand, and
we would need to filter the messages to those affecting Postgres devices.
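
One possible way to approach the device-filtering part on Linux is sketched below. It is only an illustration under stated assumptions: the helper names (device_number, device_name, mentions_device) are hypothetical, the sysfs lookup is Linux-specific, and kernel messages may name a parent disk or a device-mapper node rather than the device reported for the data directory.

```python
import os
import re


def device_number(path):
    """Return the "major:minor" number of the device that holds `path`."""
    st = os.stat(path)
    return f"{os.major(st.st_dev)}:{os.minor(st.st_dev)}"


def device_name(path):
    """Linux only: map the device number to a kernel name (e.g. a partition).

    Returns None when sysfs has no entry, which happens for filesystems
    that report an anonymous device (some overlay or btrfs setups).
    """
    link = f"/sys/dev/block/{device_number(path)}"
    if not os.path.exists(link):
        return None
    return os.path.basename(os.path.realpath(link))


def mentions_device(line, name):
    """True if `line` contains `name` as a whole token.

    Whole-token matching keeps a disk name from matching its partitions.
    """
    pattern = rf"(?<![\w-]){re.escape(name)}(?![\w-])"
    return re.search(pattern, line) is not None
```

A checker could call device_name() on the data directory and on each tablespace location, then keep only the kernel messages for which mentions_device() is true.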

The other issue is that once you see a message during a checkpoint and
exit, you don't want to see that message again after the problem has
been fixed and the server restarted. The simplest solution is to save
the output of the last check and look for only new entries. I am
attaching a script I run every 15 minutes from cron that emails me any
unexpected kernel messages.
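
The save-and-compare idea described above can be sketched as follows. This is not the attached dmesg_check script. It is a minimal illustration, and the function names and the state-file handling are assumptions. It also assumes each kernel log line carries a timestamp, so that lines are effectively unique.

```python
import subprocess
from pathlib import Path


def new_entries(previous, current):
    """Return the lines of `current` that follow the last line of `previous`."""
    if not previous:
        return list(current)
    marker = previous[-1]
    # Search backwards so the most recent occurrence of the marker is used.
    for i in range(len(current) - 1, -1, -1):
        if current[i] == marker:
            return list(current[i + 1:])
    # Marker not found: the ring buffer wrapped or the machine rebooted,
    # so treat everything as new.
    return list(current)


def read_dmesg():
    """Read the kernel ring buffer; may require privileges on some systems."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def check(state_file, read_log=read_dmesg):
    """Compare the log with the saved copy, save the new copy, return new lines."""
    state = Path(state_file)
    previous = state.read_text().splitlines() if state.exists() else []
    current = read_log()
    state.write_text("".join(line + "\n" for line in current))
    return new_entries(previous, current)
```

Run from cron, check() would return only the messages that appeared since the previous run, so an error that was already reported is not reported again after a restart.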

I am thinking we would need a contrib module with sample scripts for
various operating systems.

> > Replying to your specific case, I am not sure how we would use a script
> > to check for I/O errors/space-exhaustion if the postgres user doesn't
> > have access to it.
>
> Not sure what you mean?
>
> Space exhaustion can be checked when allocating space, FWIW. We'd just
> need to use posix_fallocate et al.
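
For illustration, the allocation-time check mentioned in the quoted text could look roughly like the sketch below. It uses Python's os.posix_fallocate wrapper, which is available only on platforms that provide the underlying call. The function name reserve_space and the 8192-byte size are assumptions made for the example, not anything taken from PostgreSQL.

```python
import errno
import os
import tempfile


def reserve_space(path, size):
    """Ask the filesystem to allocate `size` bytes for `path`.

    Returns True on success and False when the filesystem or quota is
    exhausted; any other error is re-raised.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
        return True
    except OSError as exc:
        if exc.errno in (errno.ENOSPC, errno.EDQUOT):
            return False
        raise
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Demo in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(reserve_space(os.path.join(tmp, "segment"), 8192))
```

The point of allocating up front is that running out of space shows up as an error at allocation time rather than later during writeback.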

I was asking about cases where permissions prevent viewing of kernel
messages. I think you can view them in containers, but in virtual
machines you might not have access to the host operating system's kernel
messages, and that might be where the relevant errors are logged.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

Attachment: dmesg_check (text/plain, 574 bytes)
