Re: File system level backup of shut down standby does not work?

From: "Antman, Jason (CMG-Atlanta)" <Jason(dot)Antman(at)coxinc(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, "juergen(dot)fuchsberger(at)uni-graz(dot)at" <juergen(dot)fuchsberger(at)uni-graz(dot)at>
Subject: Re: File system level backup of shut down standby does not work?
Date: 2014-02-19 01:14:49
Message-ID: 53040588.9020803@coxinc.com
Lists: pgsql-general

Juergen,

I've seen this failure quite a lot in the past, since we take backups
this way multiple times a day.

Here's the procedure we use to prevent it:
1) read the PID from postmaster.pid in the data directory
2) Issue "service postgresql-9.0 stop" (this does a fast shutdown with
-t 600)
3) loop until the PID is no longer running, or a timeout is exceeded (in
which case we error out)
4) the IMPORTANT part: run
`pg_controldata /path/to/data | grep "Database cluster state: *shut down"`
If the reported cluster state isn't "shut down" or "shut down in
recovery", then something's amiss and the backup won't be clean (error
in shutdown, etc.)
5) `sync`
6) now take the backup
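
For illustration, steps 1 and 3 through 5 could be sketched roughly as
below. This is only a sketch, not our actual script: it assumes Python 3
on a Unix host with pg_controldata on the PATH, and every function name
and the 600-second default are made up for the example. It forces
LC_ALL=C because pg_controldata's output is translated in other locales.
The service stop (step 2) and the backup itself (step 6) are left out.

```python
import os
import re
import subprocess
import time

# Cluster states that indicate a clean shutdown (primary or standby).
CLEAN_STATES = {"shut down", "shut down in recovery"}


def read_postmaster_pid(data_dir):
    """Step 1: the first line of postmaster.pid holds the postmaster PID."""
    with open(os.path.join(data_dir, "postmaster.pid")) as f:
        return int(f.readline().strip())


def pid_running(pid):
    """Return True if a process with this PID exists (signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pid, timeout=600.0, poll=1.0):
    """Step 3: poll until the PID is gone; return False if the timeout passes."""
    deadline = time.monotonic() + timeout
    while pid_running(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


def cluster_state(controldata_output):
    """Extract the 'Database cluster state' value from pg_controldata output."""
    match = re.search(r"^Database cluster state:\s*(.+?)\s*$",
                      controldata_output, re.MULTILINE)
    return match.group(1) if match else None


def is_cleanly_shut_down(controldata_output):
    """Step 4: accept only 'shut down' or 'shut down in recovery'."""
    return cluster_state(controldata_output) in CLEAN_STATES


def verify_before_backup(data_dir, pid, timeout=600.0):
    """Run steps 3-5; raise RuntimeError if the backup would not be clean."""
    if not wait_for_exit(pid, timeout):
        raise RuntimeError("postmaster %d still running after timeout" % pid)
    env = dict(os.environ, LC_ALL="C")  # keep the output in English
    output = subprocess.run(["pg_controldata", data_dir], check=True,
                            capture_output=True, text=True, env=env).stdout
    if not is_cleanly_shut_down(output):
        raise RuntimeError("unclean cluster state: %r" % cluster_state(output))
    os.sync()  # step 5: flush filesystem buffers before copying
```

The idea would be to call read_postmaster_pid() before stopping the
service, then verify_before_backup() afterwards, and only start the
rsync if no exception was raised.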

-Jason

On 02/17/2014 08:32 AM, Jürgen Fuchsberger wrote:
> Hi all,
>
> I have a master-slave configuration running the master with WAL
> archiving enabled and the slave in recovery mode reading back the WAL
> files from the master ("Log-shipping standby" as described in
> http://www.postgresql.org/docs/9.1/static/warm-standby.html)
>
> I take frequent backups of the standby server:
>
> 1) Stop standby server (fast shutdown).
> 2) Rsync to another fileserver
> 3) Start standby server.
>
> I just tried to recover one of these backups which *failed* with the
> following errors:
>
> 2014-02-17 14:27:28 CET LOG: incomplete startup packet
> 2014-02-17 14:27:28 CET LOG: database system was shut down in recovery
> at 2013-12-25 18:00:03 CET
> 2014-02-17 14:27:28 CET LOG: could not open file
> "pg_xlog/00000001000001E300000061" (log file 483, segment 97): No such
> file or directory
> 2014-02-17 14:27:28 CET LOG: invalid primary checkpoint record
> 2014-02-17 14:27:28 CET LOG: could not open file
> "pg_xlog/00000001000001E300000060" (log file 483, segment 96): No such
> file or directory
> 2014-02-17 14:27:28 CET LOG: invalid secondary checkpoint record
> 2014-02-17 14:27:28 CET PANIC: could not locate a valid checkpoint record
> 2014-02-17 14:27:29 CET FATAL: the database system is starting up
> 2014-02-17 14:27:29 CET FATAL: the database system is starting up
> 2014-02-17 14:27:30 CET FATAL: the database system is starting up
> 2014-02-17 14:27:30 CET FATAL: the database system is starting up
> 2014-02-17 14:27:31 CET FATAL: the database system is starting up
> 2014-02-17 14:27:31 CET FATAL: the database system is starting up
> 2014-02-17 14:27:32 CET FATAL: the database system is starting up
> 2014-02-17 14:27:33 CET FATAL: the database system is starting up
> 2014-02-17 14:27:33 CET FATAL: the database system is starting up
> 2014-02-17 14:27:33 CET LOG: startup process (PID 26186) was terminated
> by signal 6: Aborted
> 2014-02-17 14:27:33 CET LOG: aborting startup due to startup process
> failure
>
>
> So it seems the server is missing some WAL files which are not
> in the backup? Or is it simply not possible to take a backup of a
> standby server in recovery?
>
> Best,
> Juergen
>
>
>

--

Jason Antman | Systems Engineer | CMGdigital
jason(dot)antman(at)coxinc(dot)com | p: 678-645-4155
