Re: Postgres WAL Recovery Fails... And Then Works...

From: Phil Monroe <phil(at)identified(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, Infrastructure <infrastructure(at)identified(dot)com>, Vladimir Giverts <vlad(at)identified(dot)com>, Tejas <tejas(at)identified(dot)com>, Sasha Kipervarg <sasha(at)identified(dot)com>
Subject: Re: Postgres WAL Recovery Fails... And Then Works...
Date: 2013-01-15 19:54:49
Message-ID: 50F5B409.9000508@identified.com
Lists: pgsql-admin

Sorry, my initial response got blocked because I replied with the logs
quoted again.

> Also, which version of postgres are you using?

PostgreSQL 9.2.1 on Ubuntu 12.04

> Except in my case no number of restarts helped. You didn't say, were
> you explicitly copying $PGDATA or using some other mechanism to
> migrate the data elsewhere?

We have a very large database (~5TB), so we use a script that runs
parallel rsyncs to copy the data directory
(https://gist.github.com/4477190/#file-pmrcp-rb). The whole copy took
~3.5 hours. This was a physical copy of $PGDATA (which is located at
/var/lib/postgresql/9.2/main/ on both machines). We used the following
process:

1. Master archives WAL files to Backup Host.
2. Execute on Master: psql -c "select pg_start_backup('DATE-slave-restore')"
3. Execute on Master: RCP='rsync -cav --inplace -e rsh'
   EXCLUDE='pg_xlog' pmrcp /var/lib/postgresql/9.2/main/
   prd-db-01:/var/lib/postgresql/9.2/main/ > /tmp/backup.log
4. Execute on Master: psql -c "select pg_stop_backup()"
5. On Slave, set up recovery.conf to read the WAL archive on Backup Host.
6. Execute on Slave: pg_ctlcluster 9.2 main start (as described in the
   initial email)
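
For illustration only, steps 2 to 4 could be wrapped in one script run on
the Master, so that pg_stop_backup() is still called if the copy fails.
This is a minimal sketch, not the script used in this thread: the run
helper, the DRY_RUN variable and the base_backup function name are
assumptions made for the example. pmrcp is the wrapper from the gist
above and is assumed to be on the PATH. The redirect to /tmp/backup.log
from step 3 is left out to keep the sketch short.

```shell
#!/bin/sh
# Sketch of steps 2-4. DRY_RUN defaults to 1, so commands are only
# printed; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
PGDATA_DIR="/var/lib/postgresql/9.2/main/"  # path from the message
STANDBY_HOST="prd-db-01"                    # host from the message
BACKUP_LABEL="DATE-slave-restore"           # label from the message

run() {
    # Print the command (quoting is not preserved in the printed line),
    # then execute it only when DRY_RUN=0.
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

base_backup() {
    # Step 2: put the master into backup mode.
    run psql -c "select pg_start_backup('$BACKUP_LABEL')" || return 1

    # Step 3: parallel rsync of the data directory, excluding pg_xlog.
    # env passes RCP and EXCLUDE to pmrcp as environment variables.
    copy_status=0
    run env RCP='rsync -cav --inplace -e rsh' EXCLUDE='pg_xlog' \
        pmrcp "$PGDATA_DIR" "$STANDBY_HOST:$PGDATA_DIR" || copy_status=$?

    # Step 4: always leave backup mode, even if the copy failed.
    run psql -c "select pg_stop_backup()" || return 1

    return "$copy_status"
}
```

Calling base_backup with the default DRY_RUN=1 prints the three commands
in order without touching the database, which is a cheap way to check
the sequence before running it with DRY_RUN=0.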

Best,
Phil

