From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Allowing multiple concurrent base backups |
Date: | 2011-03-17 19:39:19 |
Message-ID: | AANLkTikVczSfgFiELCOhyUKouv=GtfuHZYeTC1+YvKJr@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jan 31, 2011 at 10:45 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Hmm, good point. It's harmless, but creating the history file in the first
>> place sure seems like a waste of time.
>
> The attached patch changes pg_stop_backup so that it doesn't create
> the backup history file if archiving is not enabled.
>
> When I tested the multiple backups, I found that they can have the same
> checkpoint location and the same history file name.
>
> --------------------
> $ for ((i=0; i<4; i++)); do
>     pg_basebackup -D test$i -c fast -x -l test$i &
>   done
>
> $ cat test0/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test0
>
> $ cat test1/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test1
>
> $ cat test2/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test2
>
> $ cat test3/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test3
>
> $ ls archive/*.backup
> archive/000000010000000000000002.000000B0.backup
> --------------------
>
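As a rough illustration of why the transcript above ends up with a single .backup file, here is a minimal Python sketch of how the history file name appears to be derived from the start WAL location. The helper names (parse_lsn, history_file_name) are hypothetical, and the 16 MB segment size and the "%08X%08X%08X.%08X.backup" pattern are assumptions based on the default build, not something stated in this thread.

```python
XLOG_SEG_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB WAL segments


def parse_lsn(text):
    """Split an LSN such as '0/20000B0' into (xlogid, xrecoff)."""
    hi, lo = text.split("/")
    return int(hi, 16), int(lo, 16)


def history_file_name(tli, start_lsn):
    """Build the backup history file name for a given timeline and start LSN."""
    xlogid, xrecoff = parse_lsn(start_lsn)
    seg = xrecoff // XLOG_SEG_SIZE     # segment number within the log file
    offset = xrecoff % XLOG_SEG_SIZE   # byte offset within that segment
    return "%08X%08X%08X.%08X.backup" % (tli, xlogid, seg, offset)


# All four concurrent backups reported the same START WAL LOCATION.
labels = ["test0", "test1", "test2", "test3"]
names = {label: history_file_name(1, "0/20000B0") for label in labels}
distinct_names = set(names.values())
print(distinct_names)
```

Since the backup label never enters the name, four backups that share a start location map to one file.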
> This would cause a serious problem, because the backup-end record,
> which indicates the same "START WAL LOCATION" for all of them, can be
> written by the first backup before the others finish. When recovering
> from one of the later backups, we might wrongly conclude that we've
> already reached a consistent state on reading the backup-end record
> written by the first backup, before reading the last required WAL file.
>
> /*
>  * Force a CHECKPOINT. Aside from being necessary to prevent torn
>  * page problems, this guarantees that two successive backup runs will
>  * have different checkpoint positions and hence different history
>  * file names, even if nothing happened in between.
>  *
>  * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
>  * fast = true). Otherwise this can take awhile.
>  */
> RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
>                   (fast ? CHECKPOINT_IMMEDIATE : 0));
>
> This problem happens because the above code (in do_pg_start_backup)
> doesn't actually ensure that concurrent backups have different
> checkpoint locations. ISTM that we should change the above code, or
> something elsewhere, to ensure that. Alternatively, we should include
> the backup label name in the backup-end record, so that recovery never
> acts on a backup-end record that is not its own.
>
> Thoughts?
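To make the two behaviours described above concrete, here is a small Python sketch that models recovery scanning for a backup-end record. It is only a toy model under stated assumptions: the BackupEnd tuple, the first_matching_end helper, and the LSN values are invented for illustration, and a label field in the backup-end record is the proposal quoted above, not something the record is known to contain.

```python
from collections import namedtuple

# Hypothetical model of a backup-end WAL record. The "label" field is the
# proposed addition; the existing record is assumed to carry only start_lsn.
BackupEnd = namedtuple("BackupEnd", "lsn start_lsn label")


def first_matching_end(wal, start_lsn, label=None):
    """Return the LSN of the first backup-end record accepted for a backup.

    With label=None only the start location is compared; with a label the
    record must also belong to the same backup.
    """
    for rec in wal:
        if rec.start_lsn != start_lsn:
            continue
        if label is not None and rec.label != label:
            continue
        return rec.lsn
    return None


# Illustrative LSNs: test0 finishes first, test3 finishes later.
wal = [
    BackupEnd(lsn=0x3000100, start_lsn=0x20000B0, label="test0"),
    BackupEnd(lsn=0x5000200, start_lsn=0x20000B0, label="test3"),
]

by_location = first_matching_end(wal, 0x20000B0)               # stops at test0's record
by_label = first_matching_end(wal, 0x20000B0, label="test3")   # waits for test3's record
```

In this model, recovering test3 by start location alone accepts test0's record and stops too early, while matching on the label as well waits for test3's own record.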
This patch is on the 9.1 open items list, but I don't understand it
well enough to know whether it's correct. Can someone else pick it
up?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company