From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Allowing multiple concurrent base backups |
Date: | 2011-03-18 08:48:09 |
Message-ID: | 4D831C49.6070408@enterprisedb.com |
Lists: | pgsql-hackers |
On 17.03.2011 21:39, Robert Haas wrote:
> On Mon, Jan 31, 2011 at 10:45 PM, Fujii Masao<masao(dot)fujii(at)gmail(dot)com> wrote:
>> On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> Hmm, good point. It's harmless, but creating the history file in the first
>>> place sure seems like a waste of time.
>>
>> The attached patch changes pg_stop_backup so that it doesn't create
>> the backup history file if archiving is not enabled.
>>
>> When I tested the multiple backups, I found that they can have the same
>> checkpoint location and the same history file name.
>>
>> --------------------
>> $ for ((i=0; i<4; i++)); do
>> pg_basebackup -D test$i -c fast -x -l test$i&
>> done
>>
>> $ cat test0/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test0
>>
>> $ cat test1/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test1
>>
>> $ cat test2/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test2
>>
>> $ cat test3/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test3
>>
>> $ ls archive/*.backup
>> archive/000000010000000000000002.000000B0.backup
>> --------------------
>>
>> This would cause a serious problem. Because the backup-end record
>> which indicates the same "START WAL LOCATION" can be written by the
>> first backup before the other finishes. So we might think wrongly that
>> we've already reached a consistency state by reading the backup-end
>> record (written by the first backup) before reading the last required WAL
>> file.
>>
>> /*
>> * Force a CHECKPOINT. Aside from being necessary to prevent torn
>> * page problems, this guarantees that two successive backup runs will
>> * have different checkpoint positions and hence different history
>> * file names, even if nothing happened in between.
>> *
>> * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
>> * fast = true). Otherwise this can take awhile.
>> */
>> RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
>> (fast ? CHECKPOINT_IMMEDIATE : 0));
>>
>> This problem happens because the above code (in do_pg_start_backup)
>> actually doesn't ensure that the concurrent backups have the different
>> checkpoint locations. ISTM that we should change the above or elsewhere
>> to ensure that.

Yes, good point.

>> Or we should include backup label name in the backup-end
>> record, to prevent a recovery from reading not-its-own backup-end record.

Backup labels are not guaranteed to be unique either, so including the
backup label in the backup-end record doesn't solve the problem. But
something else, like a backup-start counter in shared memory or the
process ID, would work.

That won't make the history file names unique, though. Now that we rely
on the end-of-backup record for detecting end-of-backup, the history
files are just for documentation purposes. Do we want to give up on
history files for backups performed with pg_basebackup? Or we could
include the backup counter or similar in the filename too.

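
For the filename option, a sketch of what that might look like. The function history_file_name and the extra counter field in the name are assumptions for illustration only; the existing format, as in the listing quoted above, is segment name, offset, then ".backup".

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: build a history file name with the backup counter appended
 * after the offset, e.g. 000000010000000000000002.000000B0.00000001.backup.
 * Returns what snprintf returns. */
static int
history_file_name(char *buf, size_t len, const char *wal_segment,
                  uint32_t offset, uint32_t backup_id)
{
    return snprintf(buf, len, "%s.%08X.%08X.backup",
                    wal_segment, (unsigned int) offset,
                    (unsigned int) backup_id);
}
```

Concurrent backups sharing a start location would then produce distinct files in the archive instead of overwriting one another.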
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com