Re: Allowing multiple concurrent base backups

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing multiple concurrent base backups
Date: 2011-03-18 08:48:09
Message-ID: 4D831C49.6070408@enterprisedb.com
Lists: pgsql-hackers
On 17.03.2011 21:39, Robert Haas wrote:
> On Mon, Jan 31, 2011 at 10:45 PM, Fujii Masao<masao(dot)fujii(at)gmail(dot)com>  wrote:
>> On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com>  wrote:
>>> Hmm, good point. It's harmless, but creating the history file in the first
>>> place sure seems like a waste of time.
>>
>> The attached patch changes pg_stop_backup so that it doesn't create
>> the backup history file if archiving is not enabled.
>>
>> When I tested the multiple backups, I found that they can have the same
>> checkpoint location and the same history file name.
>>
>> --------------------
>> $ for ((i=0; i<4; i++)); do
>> pg_basebackup -D test$i -c fast -x -l test$i&
>> done
>>
>> $ cat test0/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test0
>>
>> $ cat test1/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test1
>>
>> $ cat test2/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test2
>>
>> $ cat test3/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test3
>>
>> $ ls archive/*.backup
>> archive/000000010000000000000002.000000B0.backup
>> --------------------
>>
>> This would cause a serious problem. Because the backup-end record
>> which indicates the same "START WAL LOCATION" can be written by the
>> first backup before the other finishes. So we might think wrongly that
>> we've already reached a consistency state by reading the backup-end
>> record (written by the first backup) before reading the last required WAL
>> file.
>>
>>                 /*
>>                  * Force a CHECKPOINT.  Aside from being necessary to prevent torn
>>                  * page problems, this guarantees that two successive backup runs will
>>                  * have different checkpoint positions and hence different history
>>                  * file names, even if nothing happened in between.
>>                  *
>>                  * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
>>                  * fast = true).  Otherwise this can take awhile.
>>                  */
>>                 RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
>>                                                   (fast ? CHECKPOINT_IMMEDIATE : 0));
>>
>> This problem happens because the above code (in do_pg_start_backup)
>> actually doesn't ensure that the concurrent backups have the different
>> checkpoint locations. ISTM that we should change the above or elsewhere
>> to ensure that.

Yes, good point.

>> Or we should include backup label name in the backup-end
>> record, to prevent a recovery from reading not-its-own backup-end record.

Backup labels are not guaranteed to be unique either, so including the
backup label in the backup-end record doesn't solve the problem. But
something else, such as a backup-start counter in shared memory or the
process ID, would work.
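
To make the counter idea concrete, here is a rough sketch, not a patch.
Every name in it (BackupCounter, BackupEndRecord, backup_counter_next,
backup_end_matches) is hypothetical, and a plain struct stands in for
shared memory. It only shows that two backups sharing a START WAL
LOCATION can still be told apart if the backup-end record also carries
an ID taken from a counter at backup start.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter that would live in shared memory. */
typedef struct BackupCounter
{
    uint64_t last_id;       /* last ID handed out; 0 means none yet */
} BackupCounter;

/* Hypothetical payload of a backup-end WAL record. */
typedef struct BackupEndRecord
{
    uint64_t start_lsn;     /* START WAL LOCATION of the backup that ended */
    uint64_t backup_id;     /* counter value taken when that backup started */
} BackupEndRecord;

/* Hand out a new backup ID. Real code would need to hold a lock here. */
static uint64_t
backup_counter_next(BackupCounter *counter)
{
    return ++counter->last_id;
}

/*
 * Recovery treats a backup-end record as its own only if both the start
 * location and the backup ID match what its backup recorded at start.
 */
static bool
backup_end_matches(const BackupEndRecord *rec,
                   uint64_t my_start_lsn, uint64_t my_backup_id)
{
    return rec->start_lsn == my_start_lsn &&
           rec->backup_id == my_backup_id;
}
```

With this, a recovery started from the second backup would skip the
backup-end record written by the first one, even though both have the
same start location.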

It won't make the history file names unique, though. Now that we rely on
the end-of-backup record for detecting end-of-backup, the history files
are just for documentation purposes. Do we want to give up on history
files for backups performed with pg_basebackup? Or we could include the
backup counter or similar in the filename too.
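
As an illustration of the second option only: the sketch below appends
a backup ID to the existing <WAL file>.<offset>.backup pattern seen in
the archive listing above. The helper name history_file_name, the extra
field, and its position in the name are all assumptions, not an
existing or proposed format.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a history file name that includes a
 * per-backup ID, so that backups sharing a start location get
 * distinct names. Returns what snprintf returns.
 */
static int
history_file_name(char *buf, size_t len, const char *wal_file,
                  uint32_t offset, uint32_t backup_id)
{
    return snprintf(buf, len, "%s.%08X.%08X.backup",
                    wal_file, (unsigned int) offset,
                    (unsigned int) backup_id);
}
```

For the four concurrent backups in the test above, IDs 1 through 4
would yield four different names instead of a single
000000010000000000000002.000000B0.backup.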

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
