Re: pg_dump, pg_dumpall and data durability

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump, pg_dumpall and data durability
Date: 2016-10-14 06:09:16
Message-ID: CAB7nPqRZh178Ld8X+9KfDRA6uN+3jBQx1Z5aW5D+F=6qHp1c6A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 13, 2016 at 2:49 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> In my quest to make the backup tools more compliant with data
> durability, here is a thread for pg_dump and pg_dumpall. Here is my
> proposal in a couple of lines:
> - Addition in _archiveHandle of a field to track if the dump generated
> should be synced or not.
> - This is effective for all modes, when the user specifies an output
> file. In short that's when fileSpec is not NULL.
> - Actually do the sync in _EndData and _EndBlob[s] if appropriate.
> There is for example nothing to do for pg_backup_null.c
> - Addition of --nosync option to allow users to disable it. By default
> it is enabled.
> Note that to make the data durable, the file needs to be sync'ed as
> well as its parent folder. So with pg_dump we can only make that
> really durable with -Fd. I think that in the case where the user
> specifies an output file for the other modes we should sync it, that's
> the best we can do. This last statement applies as well for
> pg_dumpall.
>
> Thoughts? I'd like to prepare a patch according to those lines for the next CF.
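
To illustrate the point quoted above that a file is only durable once
both the file and its parent folder have been synced, here is a minimal
Python sketch. It is not taken from the patch: the helper name
write_durably is hypothetical, and it assumes a POSIX system where a
directory can be opened read-only and fsync'ed.

```python
import os


def write_durably(path, data):
    """Write bytes to path, then sync the file and its parent directory.

    Hypothetical helper for illustration only; assumes POSIX semantics.
    """
    # Write the bytes and flush them to stable storage before closing.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Sync the parent directory so the new directory entry is durable too.
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping the second fsync corresponds to syncing only the output file
and not the folder that contains it.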

Okay, here is a patch doing the above. I have added a new --nosync
option to pg_dump and pg_dumpall to switch to the pre-10 behavior. I
have concluded that it is better not to touch _EndData and _EndBlob,
and to just issue the fsync in CloseArchive once all the write
operations are done. In the case of the directory format, the fsync is
done on all the entries recursively. This also makes the patch
simpler. The regression tests calling pg_dump don't use --nosync yet
in this patch; that change could be made afterwards.
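
As a rough illustration of what "fsync on all the entries recursively"
means for a directory-format dump, here is a small Python sketch. It is
not the patch's C code: the names fsync_path and fsync_tree are
hypothetical, and it assumes a POSIX system where directories can be
opened read-only and fsync'ed.

```python
import os


def fsync_path(path, is_dir):
    """fsync one file or directory (hypothetical helper, POSIX only)."""
    # Directories are opened read-only; regular files read-write.
    flags = os.O_RDONLY if is_dir else os.O_RDWR
    fd = os.open(path, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def fsync_tree(root):
    """fsync every file and directory under root, then root's parent.

    Returns the list of paths under root that were synced.
    """
    synced = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            fsync_path(file_path, is_dir=False)
            synced.append(file_path)
        # Sync the directory itself so its entries are durable.
        fsync_path(dirpath, is_dir=True)
        synced.append(dirpath)
    # The entry for root lives in its parent directory, so sync that too.
    fsync_path(os.path.dirname(os.path.abspath(root)), is_dir=True)
    return synced
```

Running the walk once, after all writes are finished, mirrors the idea
of issuing the sync at close time rather than after each data block.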

I have added that to next CF:
https://commitfest.postgresql.org/11/823/
--
Michael

Attachment: pgdump-sync-v1.patch (application/x-download, 13.0 KB)
