Re: Postgres WAL Recovery Fails... And Then Works...

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Phil Monroe <phil(at)identified(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, Infrastructure <infrastructure(at)identified(dot)com>, Vladimir Giverts <vlad(at)identified(dot)com>, Tejas <tejas(at)identified(dot)com>, Sasha Kipervarg <sasha(at)identified(dot)com>
Subject: Re: Postgres WAL Recovery Fails... And Then Works...
Date: 2013-01-15 10:51:50
Message-ID: 50F534C6.4000501@vmware.com
Lists: pgsql-admin

On 12.01.2013 04:32, Phil Monroe wrote:
> Hi Everyone,
>
> So we had to failover and do a full base backup to get our slave database back
> online and ran into an interesting scenario. After copying the data directory,
> setting up the recovery.conf, and starting the slave database, the database
> crashes while replaying xlogs. However, trying to start the database again, the
> database is able to replay xlogs farther than it initially got, but ultimately
> ended up failing out again. After starting the DB a third time, PostgreSQL
> replays even further and catches up to the master to start streaming
> replication. Is this common and/or acceptable?

How did you perform the base backup? Did you use pg_basebackup? Or, if
you did a filesystem-level copy, did you call pg_start_backup() before
the copy and pg_stop_backup() after it? Did you take the base backup
from the master server, or from another slave?
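
For reference, the filesystem-level procedure mentioned above has a fixed order on 9.2-era servers: call pg_start_backup(), copy the data directory, then call pg_stop_backup(). What follows is only a minimal dry-run sketch of that order. The function name, the backup label, the paths, and the choice of rsync as the copy tool are all illustrative assumptions and are not taken from this thread; the sketch only builds the command lines and runs nothing.

```python
def base_backup_plan(label, data_dir, dest):
    """Return the commands for a filesystem-level base backup, in order.

    Hypothetical helper for illustration: nothing is executed here.
    Connection options for psql (host, user, database) are left out.
    """
    if "'" in label:
        # Keep the label safe to embed in the SQL string literal below.
        raise ValueError("backup label must not contain a single quote")
    return [
        # 1. Tell the master that a base backup is starting.
        ["psql", "-c", "SELECT pg_start_backup('%s');" % label],
        # 2. Copy the data directory while the server keeps running.
        #    postmaster.pid belongs to the running server, so skip it.
        ["rsync", "-a", "--exclude=postmaster.pid",
         data_dir.rstrip("/") + "/", dest],
        # 3. Tell the master that the copy has finished.
        ["psql", "-c", "SELECT pg_stop_backup();"],
    ]


if __name__ == "__main__":
    # Assumed example paths; adjust to the actual installation.
    for cmd in base_backup_plan("slave_rebuild",
                                "/var/lib/postgresql/9.2/main",
                                "standby:/var/lib/postgresql/9.2/main"):
        print(" ".join(cmd))
```

If the copy step is skipped or runs outside the start/stop pair, the resulting backup cannot be relied on, which is why the order matters more than the particular copy tool.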

This looks similar to the bug discussed here:
http://www.postgresql.org/message-id/CAMkU=1wpvYJVEDo6Qvq4QbosZ+AV6BMVCf+XVCG=mJqFRjQ8Pg@mail.gmail.com
That was fixed in 9.2.2, so if you're using 9.2.1 or 9.2.0, try upgrading.
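
The version check can be written down as a small helper. This is a sketch only: the function names are hypothetical, the input is assumed to be a plain three-part version string such as the one returned by `SHOW server_version;`, and it covers only the 9.2 branch named above. It says nothing about other release branches.

```python
FIXED_IN = (9, 2, 2)  # release named above as containing the fix


def parse_version(text):
    """Turn a string like '9.2.1' into a tuple like (9, 2, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(version_text):
    """True for 9.2.x releases older than 9.2.2 (that is, 9.2.0 and 9.2.1)."""
    version = parse_version(version_text)
    return version[:2] == FIXED_IN[:2] and version < FIXED_IN


if __name__ == "__main__":
    for candidate in ("9.2.0", "9.2.1", "9.2.2"):
        print(candidate, needs_upgrade(candidate))
```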

- Heikki
