Re: pg_start_backup does not actually allow for consistent, file-level backup

From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "'otheus uibk *EXTERN*'" <otheus(dot)uibk(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: pg_start_backup does not actually allow for consistent, file-level backup
Date: 2015-06-08 13:04:11
Message-ID: A737B7A37273E048B164557ADEF4A58B3661DE6B@ntex2010i.host.magwien.gv.at
Lists: pgsql-general

otheus uibk wrote:
> The manual and in this mailing list, the claim is made that consistent, file-level backups may be made
> by bracketing the file-copy operation with the postgresql pg_start_backup and pg_stop_backup
> operations. Many people including myself have found that in some circumstances, using "tar" to copy
> these files will result in an error if one of the data files changes during the tar operation. The
> responses to those queries on this mailing list are unsatisfactory ("everything is fine, trust us").

Everything is fine, trust us.

>> bash-3.00# tar -cf 16aprilstandby.tar /db-data/
>> tar: Removing leading `/' from member names
>> tar: /db-data/base/24643/151667: file changed as we read it
>> tar: /db-data/base/24643/151764.2: file changed as we read it
>> tar: /db-data/base/24643/151766: file changed as we read it
>> tar: /db-data/base/24643/151768: file changed as we read it
>> tar: /db-data/base/66412/151969: file changed as we read it

> The above scenario is exactly what I saw, albeit with less frequency and severity.

> I decided to test this claim that these messages are "perfectly harmless" and "can be ignored":
[...]
> As you can see below, there were non-zero changes made to these files.
[...]
> Such changes occurred EVEN WHEN TAR DID NOT WARN of changed files. Further, when step 3 involved an
> actual backup, involving minutes, not milliseconds, dozens of differences to files in data/base/...
> are reported. To be clear, I excluded from consideration all files in pg_xlog, pg_clog, pg_subtrans,
> pg_stat_tmp.
>
> If these files are changing during the pg_start_backup() and pg_stop_backup, then exactly what is
> their purpose? Might they be changing during the tar, as tar thinks? How may an operator be assured
> the snapshot is consistent (unless one stops the databases)? Will the redo logs restore the files to
> a consistent state, no matter when these files are changed? I find it hard to believe that would be
> the case.

The files are indeed changing while they are backed up.

The tar archive by itself is not a consistent backup.

Redo using the Write Ahead Log will indeed restore the files to a consistent state,
astonishing as that may be.
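
As a rough sketch of what this means for a backup script: the "file changed as we read it" warnings quoted above should not be treated as a failed backup. The sketch assumes GNU tar, which exits with status 1 when files changed while being archived and with status 2 on a fatal error. The function names, the backup label and the arguments are made up for illustration, and WAL archiving is assumed to be configured already.

```shell
#!/bin/sh
# Assumption: GNU tar exit codes (0 = success, 1 = some files changed
# while being archived, 2 = fatal error). Other tars may differ.
tar_status_ok_for_base_backup() {
    case "$1" in
        0|1) return 0 ;;   # changed files are expected on a running server
        *)   return 1 ;;   # anything else is a real failure
    esac
}

# Hypothetical wrapper: bracket the file copy with pg_start_backup and
# pg_stop_backup. $1 = data directory, $2 = archive file to create.
run_base_backup() {
    datadir=$1
    archive=$2
    psql -c "SELECT pg_start_backup('base_backup', true);" || return 1
    tar -cf "$archive" "$datadir"
    rc=$?
    # Always end backup mode, even if tar failed.
    psql -c "SELECT pg_stop_backup();" || return 1
    tar_status_ok_for_base_backup "$rc"
}
```

A call might look like run_base_backup /db-data/ /backup/base.tar. The resulting archive is only usable together with the WAL written between pg_start_backup and pg_stop_backup, which is what gets replayed on restore.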

Yours,
Laurenz Albe
