Re: BUG #7500: hot-standby replica crash after an initial rsync

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Stuart Bishop <stuart(at)stuartbishop(dot)net>
Subject: Re: BUG #7500: hot-standby replica crash after an initial rsync
Date: 2012-08-29 15:59:06
Message-ID: 201208291759.07170.andres@2ndquadrant.com
Lists: pgsql-bugs
On Wednesday, August 29, 2012 05:32:31 PM Stuart Bishop wrote:
> I believe I just hit this same issue, but with PG 9.1.3:
> 
> <@:32407> 2012-08-29 10:02:09 UTC LOG:  shutting down
> <@:32407> 2012-08-29 10:02:09 UTC LOG:  database system is shut down
> <[unknown](at)[unknown]:31687> 2012-08-29 13:34:03 UTC LOG:  connection
> received: host=[local]
> <[unknown](at)[unknown]:31687> 2012-08-29 13:34:03 UTC LOG:  incomplete
> startup packet
> <@:31686> 2012-08-29 13:34:03 UTC LOG:  database system was
> interrupted; last known up at 2012-08-29 13:14:47 UTC
> <@:31686> 2012-08-29 13:34:03 UTC LOG:  entering standby mode
> <@:31686> 2012-08-29 13:34:03 UTC LOG:  redo starts at A92/5F000020
> <@:31686> 2012-08-29 13:34:03 UTC FATAL:  could not access status of
> transaction 208177034
> <@:31686> 2012-08-29 13:34:03 UTC DETAIL:  Could not read from file
> "pg_multixact/offsets/0C68" at offset 131072: Success.
> <@:31686> 2012-08-29 13:34:03 UTC CONTEXT:  xlog redo create multixact
> 208177034 offset 1028958730: 1593544329 1593544330
> <@:31681> 2012-08-29 13:34:03 UTC LOG:  startup process (PID 31686)
> exited with exit code 1
> <@:31681> 2012-08-29 13:34:03 UTC LOG:  terminating any other active
> server processes
> 
> This was attempting to rebuild a hot standby after switching my master
> to a new server. In between the shutdown and the attempt to restart:
> 
>  - The master was put into backup mode.
>  - The datadir was rsynced over, using rsync -ahhP --delete-before
> --exclude=postmaster.pid --exclude=pg_xlog
>  - The master was taken out of backup mode.
>  - The pg_xlog directory was emptied
>  - The pg_xlog directory was rsynced across from the master. This
> included all the WAL files from before the promotion, throughout
> backup mode, and a few from after backup mode was left.
That's not valid: you cannot easily guarantee that you have not copied files
that were in the process of being written to. Use a restore_command if you do
not want all files to be transferred via the replication connection, but do
that only for files that have been archived via an archive_command beforehand.
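
As a rough illustration of that setup, here is a minimal sketch for a 9.1
master and standby. It assumes a shared archive directory
/mnt/server/archivedir and a connection string for a host named
master.example.com with a user named replicator; all three are hypothetical
placeholders and not taken from the report above. Adjust them to your
environment.

```
# postgresql.conf on the master (archive path is a hypothetical placeholder)
wal_level = hot_standby
archive_mode = on
# Copy each completed WAL segment to the archive, refusing to overwrite.
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

# recovery.conf on the standby (host and user are hypothetical placeholders)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
# Fetch only segments that archive_command has already finished writing.
restore_command = 'cp /mnt/server/archivedir/%f %p'
```

With something like this in place, the standby only ever reads segments that
the master has finished and archived, so there is no need to copy pg_xlog by
hand.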

Did you have a backup_label file in the rsync'ed datadir? In Maxim's case I
could tell from the line numbers in his log that he did not, but I do not see
them here...

Greetings,

Andres
-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

