Re: Allowing multiple concurrent base backups

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Allowing multiple concurrent base backups
Date:	2011-03-17 19:39:19
Message-ID:	AANLkTikVczSfgFiELCOhyUKouv=GtfuHZYeTC1+YvKJr@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Jan 31, 2011 at 10:45 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Hmm, good point. It's harmless, but creating the history file in the first
>> place sure seems like a waste of time.
>
> The attached patch changes pg_stop_backup so that it doesn't create
> the backup history file if archiving is not enabled.
>
> When I tested the multiple backups, I found that they can have the same
> checkpoint location and the same history file name.
>
> --------------------
> $ for ((i=0; i<4; i++)); do
>     pg_basebackup -D test$i -c fast -x -l test$i &
> done
>
> $ cat test0/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test0
>
> $ cat test1/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test1
>
> $ cat test2/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test2
>
> $ cat test3/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test3
>
> $ ls archive/*.backup
> archive/000000010000000000000002.000000B0.backup
> --------------------
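
The single .backup file in the listing follows from how the history file name is built: it depends only on the timeline and the start WAL location, so backups that share a start location also share a name. Below is a minimal sketch of that derivation. It assumes the default 16 MB WAL segment size and a format string modeled on the BackupHistoryFileName macro as I understand it; history_file_name and same_history_file are hypothetical helpers, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default WAL segment size of 16 MB. */
#define SKETCH_SEG_SIZE (16u * 1024u * 1024u)

/*
 * Hypothetical helper: build a backup history file name from the timeline
 * and the start WAL location (xlogid/xrecoff, as in "0/20000B0").
 */
int history_file_name(char *buf, size_t len,
                      uint32_t tli, uint32_t xlogid, uint32_t xrecoff)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%08X%08X%08X.%08X.backup",
                    (unsigned) tli, (unsigned) xlogid,
                    (unsigned) (xrecoff / SKETCH_SEG_SIZE),   /* segment number */
                    (unsigned) (xrecoff % SKETCH_SEG_SIZE));  /* offset in segment */
}

/* True when two backups would end up with the same history file name. */
bool same_history_file(uint32_t tli_a, uint32_t id_a, uint32_t off_a,
                       uint32_t tli_b, uint32_t id_b, uint32_t off_b)
{
    char a[64], b[64];

    history_file_name(a, sizeof a, tli_a, id_a, off_a);
    history_file_name(b, sizeof b, tli_b, id_b, off_b);
    return strcmp(a, b) == 0;
}
```

With timeline 1 and start location 0/20000B0, the helper yields 000000010000000000000002.000000B0.backup, which matches the one file shown in the archive directory; the label (test0 through test3) plays no part in the name.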
>
> This would cause a serious problem, because the backup-end record
> that carries the same "START WAL LOCATION" can be written by the
> first backup before the others finish. Recovery might then wrongly
> conclude that it has already reached a consistent state upon reading
> that backup-end record (written by the first backup), before it has
> read the last required WAL file.
>
>     /*
>      * Force a CHECKPOINT. Aside from being necessary to prevent torn
>      * page problems, this guarantees that two successive backup runs will
>      * have different checkpoint positions and hence different history
>      * file names, even if nothing happened in between.
>      *
>      * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
>      * fast = true). Otherwise this can take awhile.
>      */
>     RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
>                       (fast ? CHECKPOINT_IMMEDIATE : 0));
>
> This problem happens because the above code (in do_pg_start_backup)
> doesn't actually ensure that concurrent backups have different
> checkpoint locations. ISTM that we should change the above code, or
> something elsewhere, to ensure that. Alternatively, we could include the
> backup label name in the backup-end record, so that a recovery never acts
> on a backup-end record that is not its own.
>
> Thoughts?
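
The ambiguity described in the quoted message can be sketched as follows. This is only an illustration under stated assumptions, not PostgreSQL's recovery code: BackupLabel, BackupEndRecord, and both matching functions are hypothetical, and the label field in the backup-end record corresponds to the second option proposed above rather than to anything that exists in the record today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of what recovery knows from backup_label. */
typedef struct {
    uint64_t    start_loc;  /* START WAL LOCATION */
    const char *label;      /* LABEL */
} BackupLabel;

/* Hypothetical, simplified backup-end WAL record. */
typedef struct {
    uint64_t    start_loc;  /* start location of the backup that ended */
    const char *label;      /* assumed extra field (the proposed change) */
} BackupEndRecord;

/* Match on start location only: ambiguous when backups share it. */
bool ends_my_backup_by_location(const BackupLabel *mine,
                                const BackupEndRecord *rec)
{
    return rec->start_loc == mine->start_loc;
}

/* Match on start location and label: rejects another backup's record. */
bool ends_my_backup_by_label(const BackupLabel *mine,
                             const BackupEndRecord *rec)
{
    assert(mine->label != NULL && rec->label != NULL);
    return rec->start_loc == mine->start_loc &&
           strcmp(rec->label, mine->label) == 0;
}
```

For a recovery from test1 (start location 0/20000B0), a backup-end record written by test0 passes the location-only check, which is the false "consistent" signal described above; the label-aware check rejects it and accepts only the record written by test1. The other option mentioned, forcing distinct checkpoint locations, would make the location-only check sufficient again.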

This patch is on the 9.1 open items list, but I don't understand it
well enough to know whether it's correct. Can someone else pick it
up?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Allowing multiple concurrent base backups at 2011-02-01 03:45:27 from Fujii Masao

Responses

Re: Allowing multiple concurrent base backups at 2011-03-18 08:48:09 from Heikki Linnakangas
