Re: silent data loss with ext4 / all current versions

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: silent data loss with ext4 / all current versions
Date: 2016-03-15 14:39:50
Message-ID: CAB7nPqSytVG1o4S3S2pA1O=692ekurJ+fckW2PywEG3sNw54Ow@mail.gmail.com
Lists: pgsql-hackers
On Thu, Mar 10, 2016 at 4:25 AM, Andres Freund wrote:
> I've finally pushed these, after making a number of mostly cosmetic
> fixes. The only one of real consequence is that I've removed the durable_*
> call from the renames to .deleted in xlog[archive].c - these don't need
> to be durable, and are windows only. Oh, and that there was a typo in
> the !HAVE_WORKING_LINK case.
>
> There's a *lot* of version skew here: not-present functionality, moved
> files, different APIs - we got it all.  I've tried to check in each
> version whether we're missing fsyncs for renames and everything.
> Michael, *please* double check the diffs for the different branches.

I have finally been able to spend some time reviewing what you pushed
to the back-branches, and I think things are in correct shape. One small
issue: for EXEC_BACKEND builds, write_nondefault_variables still uses
one instance of rename(). I cannot really believe that there are
production builds of Postgres with EXEC_BACKEND on non-Windows
platforms, but I think that we had better cover our backs in this code
path. As for the other two remaining calls of rename() in xlog.c and
xlogarchive.c, I think those are fine left untouched; there is no need
to care about WIN32 blocks.

> Note that we currently have some frontend programs with the equivalent
> problem. Most importantly receivelog.c (pg_basebackup/pg_receivexlog)
> are missing pretty much the same directory fsyncs.  And at least for
> pg_receivexlog it's critical, especially now that receivexlog supports
> syncrep.  I've not done anything about that; there's pretty much no
> chance to share backend code with the frontend in the back-branches.

Yeah, true. We definitely need to do something about that. Even for
HEAD, it seems like overkill to add something to, for example,
src/common for the frontends when the fix is localized (pg_rewind may
use something else). It would be nice to finish wrapping that up for
the next minor release, so I propose the attached patches. At the same
time, I think that adminpack had better be fixed as well, so there are
actually three patches in this series. I shaped them with a backpatch
in mind, by the way, particularly 0002.
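
For readers following along, the general pattern under discussion
(flush the file, rename it, then fsync the containing directory so the
new directory entry itself survives a crash) can be sketched in plain
POSIX C as below. This is only an illustration, not the code from the
attached patches nor the backend's durable_* functions; the names
sketch_fsync_path() and sketch_durable_rename() are hypothetical. It
assumes both paths live in the same directory and a platform such as
Linux where a directory can be opened read-only and fsync'ed.

```c
/*
 * Illustrative sketch only. sketch_fsync_path() and
 * sketch_durable_rename() are hypothetical names, not PostgreSQL APIs.
 */
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Open "path" with the given flags and fsync it. Returns 0 on success. */
static int
sketch_fsync_path(const char *path, int flags, int is_dir)
{
    int fd = open(path, flags);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    /* Assumption: tolerate platforms that refuse to fsync a directory. */
    if (rc != 0 && is_dir && (errno == EBADF || errno == EINVAL))
        rc = 0;
    close(fd);
    return rc;
}

/* Rename oldpath to newpath so that the result survives a crash. */
static int
sketch_durable_rename(const char *oldpath, const char *newpath)
{
    char dirbuf[4096];

    /* 1. Make sure the file contents are on disk before renaming. */
    if (sketch_fsync_path(oldpath, O_RDWR, 0) != 0)
        return -1;

    /* 2. Atomically replace the directory entry. */
    if (rename(oldpath, newpath) != 0)
        return -1;

    /* 3. Flush the parent directory so the rename itself is persisted. */
    if (strlen(newpath) >= sizeof(dirbuf))
        return -1;
    strcpy(dirbuf, newpath);    /* dirname() may modify its argument */
    return sketch_fsync_path(dirname(dirbuf), O_RDONLY, 1);
}
```

A frontend doing a bare rename() gets only step 2, which is why the
renamed file can vanish after a crash even though rename() returned
success. If the two paths were in different directories, the old
parent directory would need an fsync as well.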
-- 
Michael

Attachment: 0001-Make-rename-calls-for-log-files-in-adminpack-durable.patch
Description: text/x-patch (2.2 KB)
Attachment: 0002-Avoid-potential-data-loss-in-pg_receivexlog.patch
Description: text/x-patch (4.7 KB)
Attachment: 0003-Avoid-potential-lost-rename-of-new-parameter-file-in.patch
Description: text/x-patch (990 bytes)
