Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1
Date: 2013-11-18 19:38:43
Message-ID: B4C7F01E-5789-493B-A009-D72BF2AED668@thebuild.com
Lists: pgsql-hackers


On Nov 18, 2013, at 11:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Could you detail how exactly the base backup was created? Including the
> *exact* logic for copying?

0. Before any of this began, P1 was archiving WAL segments to AWS-S3.
1. pg_start_backup('', true) on P1.
2. Using rsync -av on P1, the entire $PGDATA directory was pushed from P1 to S1.
3. Once the rsync was complete, pg_stop_backup() on P1.
4. Create appropriate recovery.conf on S1.
5. Bring up PostgreSQL on S1.
6. PostgreSQL recovers normally (pulling WAL segments from WAL-E), and eventually connects to P1.
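
For reference, steps 1-3 can be sketched as a small shell script. This is not the script that was actually run, only an illustration of the sequence described above. The standby host name, the data directory path, the `run` helper, and the `DRY_RUN` switch are all assumptions; by default the sketch only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of steps 1-3: start backup mode, rsync $PGDATA to the standby,
# then stop backup mode. Names below are hypothetical, not from the thread.
PGDATA="${PGDATA:-/data/9.3/main}"      # assumed data directory on P1
STANDBY="${STANDBY:-s1.example.com}"    # hypothetical host name for S1
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only (default)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

base_backup() {
    # Step 1: enter backup mode; 'true' requests an immediate checkpoint.
    run psql -c "SELECT pg_start_backup('', true)"
    # Step 2: push the whole data directory from P1 to the standby.
    run rsync -av "$PGDATA/" "$STANDBY:$PGDATA/"
    # Step 3: leave backup mode once the copy has finished.
    run psql -c "SELECT pg_stop_backup()"
}

# Usage: call base_backup to print the commands, or run with DRY_RUN=0
# on P1 to actually execute them.
```

The script deliberately does not abort on an rsync error, so that pg_stop_backup() is still issued and P1 is not left in backup mode.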

> Do you have the log entries for the startup after the base backup?

Sadly, not anymore.

> This server is gone, right?

Correct.

> Could you list the *exact* steps you did to startup the cluster?

0. Before any of this began, P2 was archiving WAL segments to AWS-S3.
1. Initial (empty) data directory deleted on S2.
2. New data directory created with:

/usr/lib/postgresql/9.3/bin/pg_basebackup --verbose --progress --xlog-method=stream --host=<ip> --user=repluser --pgdata=/data/9.3/main

3. Once the pg_basebackup completed, create appropriate recovery.conf on S2.
4. Bring up PostgreSQL on S2.
5. PostgreSQL recovers normally (pulling a small number of WAL segments from WAL-E), and eventually connects to P2.
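
The recovery.conf files themselves were not posted. For a 9.3 streaming standby that also restores archived WAL through WAL-E, a file of roughly this shape would be typical. Every value here is an assumption for illustration: the envdir path follows the WAL-E README convention, and the host placeholder and user are borrowed from the pg_basebackup command above.

```
# Hypothetical recovery.conf sketch; not the file used on S1 or S2.
standby_mode = 'on'
# Streaming connection to the primary (placeholder host, as above).
primary_conninfo = 'host=<ip> user=repluser'
# Fetch archived segments from S3 via WAL-E until streaming takes over.
restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch "%f" "%p"'
```

With both settings present, the standby replays segments from the archive first and switches to streaming from the primary once it has caught up, which matches the recovery behaviour described in the final step of each list.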

--
-- Christophe Pettus
xof(at)thebuild(dot)com
