Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)
Date: 2016-09-10 09:27:45
Message-ID: f8dff810-f5f4-77c3-933b-127df4ed94e5@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/3/16 9:23 AM, Michael Paquier wrote:
> On Sat, Sep 3, 2016 at 10:22 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> On Sat, Sep 3, 2016 at 10:21 PM, Michael Paquier
>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>> Oh, well. I have just implemented it on top of the two other patches
>>> for pg_basebackup. For pg_receivexlog, I am wondering if it makes
>>> sense to have it. That would be trivial to implement it, and I think
>>> that we had better make the combination of --synchronous and --nosync
>>> just leave with an error. Thoughts about having that for
>>> pg_receivexlog?
>>
>> With patches that's actually better..
>
> Meh, meh et meh.

In fsync_pgdata(), you have removed the progname from one error
message, even though it is passed into the function. Also, I think
fsync_pgdata() should not be printing initdb's progress messages.
That should stay in initdb. Worse, in 0002 you are then changing the
output, presumably to suit pg_basebackup or something else, which
messes up the initdb output.

durable_rename() does not fsync an existing newfile before renaming,
in contrast to the backend code in fd.c.

I'm slightly concerned about lots of code duplication in
durable_rename, fsync_parent_path between backend and standalone code.
Also, I'm equally slightly concerned that having duplicate function
names will mess up tag search for someone.

This is a preexisting mistake, but fsync_fname_ext() should just be
fsync_fname() to match the backend naming. In the backend,
fsync_fname_ext() "extends" fsync_fname() by accepting an ignore_perm
argument, but the initdb code doesn't do that.

I don't think tar file output in pg_basebackup needs to be fsynced.
It's just a backup file like what pg_dump produces, and we don't fsync
that either. The point of this change is to leave a data directory in
a synced state equivalent to what initdb leaves behind.

Some of the changes in receivelog.c seem excessive. Replacing a plain
fsync() with fsync_fname_ext() plus fsync_parent_path() isn't
justified by the references you have provided. This would need more
explanation.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mattia 2016-09-10 10:18:04 Allow to_date() and to_timestamp() to accept localized month names
Previous Message Michael Paquier 2016-09-10 08:22:08 Re: PATCH: Exclude additional directories in pg_basebackup