Re: silent data loss with ext4 / all current versions

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: silent data loss with ext4 / all current versions
Date: 2016-03-08 05:55:52
Message-ID: 20160308055552.akvmwjer6km76qqi@alap3.anarazel.de
Lists: pgsql-hackers

On 2016-03-08 12:26:34 +0900, Michael Paquier wrote:
> On Tue, Mar 8, 2016 at 12:18 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2016-03-08 12:01:18 +0900, Michael Paquier wrote:
> >> I have spent a couple of hours looking at that in detail, and the
> >> patch is neat.
> >
> > Cool. Doing some more polishing right now. Will be back with an updated
> > version soonish.
> >
> > Did you do some testing?
>
> Not in much detail yet. I just ran check-world with fsync enabled
> for the recovery tests, plus quick manual tests with a cluster
> manually set up. I'll do more with your new version now that I know
> there will be one.

Here's my updated version.

Note that I've split the patch into two: 0001 for the infrastructure
(durable_rename and durable_link_or_rename), and 0002 for the callsites.
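
For readers without the attachments at hand, here is a rough sketch of what a durable rename generally involves on POSIX systems. It is not the code from the attached patch: the names sketch_durable_rename and fsync_path are hypothetical, error reporting is reduced to return codes, and it assumes a platform (such as Linux) where a directory opened read-only can be fsync()ed. It only syncs the directory of the new name, so it also assumes both names live in the same directory.

```c
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open a path, fsync it, close it. Returns 0 or -1. */
int
fsync_path(const char *path, int flags)
{
    int fd = open(path, flags);

    if (fd < 0)
        return -1;
    if (fsync(fd) != 0)
    {
        int save_errno = errno;

        close(fd);
        errno = save_errno;
        return -1;
    }
    return close(fd);
}

/*
 * Hypothetical sketch of a rename that should survive a crash:
 * flush the file, rename it, flush it again under the new name,
 * then flush the directory that holds the new entry.
 */
int
sketch_durable_rename(const char *oldfile, const char *newfile)
{
    char parent[4096];

    /* Make sure the file's contents are on disk before it gets its new name. */
    if (fsync_path(oldfile, O_RDWR) != 0)
        return -1;

    if (rename(oldfile, newfile) != 0)
        return -1;

    /* Flush the file again, now reachable under the new name. */
    if (fsync_path(newfile, O_RDWR) != 0)
        return -1;

    /* dirname() may modify its argument, so work on a copy. */
    if (strlen(newfile) >= sizeof(parent))
    {
        errno = ENAMETOOLONG;
        return -1;
    }
    strcpy(parent, newfile);

    /* Persist the directory entry itself. */
    return fsync_path(dirname(parent), O_RDONLY);
}
```

The directory fsync is the step that is easy to forget: without it, the rename itself may not be on disk after a crash even though the file data is.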

> >> + /* XXX: Add racy file existence check? */
> >> + if (rename(oldfile, newfile) < 0)
> >
> >> I am not sure we should worry about that. What do you think could
> >> cause the old file to go missing all of a sudden? Other backend
> >> processes are not touching it in the code paths where this routine
> >> is called. Perhaps adding a comment in the header to let users know
> >> that would help?
> >
> > What I'm thinking of is adding a check whether the *target* file already
> > exists, and error out in that case. Just like the link() based path
> > normally does.
>
> Ah, OK. Well, why not. I'd rather have an assertion instead of an error though.

I think it should definitely be an error if anything. But I'd rather
only add it in master...
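
To make the idea concrete, a minimal sketch of such a target-existence check in front of rename(). The function name rename_noreplace_racy is hypothetical and this is not code from the patch. The check is racy by construction: another process can create the target between access() and rename(), whereas link() refuses an existing target atomically with EEXIST.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical sketch: refuse to rename over an existing target.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
rename_noreplace_racy(const char *oldfile, const char *newfile)
{
    if (access(newfile, F_OK) == 0)
    {
        /* Target is already there: report it the way link() would. */
        errno = EEXIST;
        return -1;
    }

    /* Anything other than "does not exist" is a real error. */
    if (errno != ENOENT)
        return -1;

    /* Window for a race: the target could appear right here. */
    return rename(oldfile, newfile);
}
```

A caller would treat a -1 return with errno set to EEXIST as a hard error rather than silently overwriting the target.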

Andres

Attachment                                                           Content-Type  Size
0001-Introduce-durable_rename-and-durable_link_or_rename.patch       text/x-patch  12.2 KB
0002-Avoid-unlikely-data-loss-scenarios-due-to-rename-wit.patch      text/x-patch  12.4 KB
