Re: Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, valgog <valgog(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes
Date: 2010-08-12 05:31:57
Message-ID: 6027.1281591117@sss.pgh.pa.us
Lists: pgsql-bugs

Fujii Masao <masao(dot)fujii(at)gmail(dot)com> writes:
> On Fri, Aug 6, 2010 at 7:50 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> The procedure used does differ from that documented. However, IMHO the
>> procedure *documented* is *not* safe and could lead to corrupt indexes
>> in the way described, since the last recovered point might be mid-way
>> between two halves of an index split record, which will never be
>> corrected during HS.

> An index split record is replayed by two calls of rm_redo()? If not,
> we don't need to worry about the above since the last recovered point
> which pg_last_xlog_replay_location() returns is updated after every
> rm_redo().

Yeah, I thought that was bogus too. If we're following a live master,
the second xlog record should be along shortly, and in any case queries
will give the correct result in between. The problem is only interesting
if the WAL series ends and we have to cons up the split completion by
ourselves; but the logic to do that does exist.

What was bothering me about the procedure is that it's not clear when
the new slave has reached consistency, in the sense of having used WAL
to clean up any out-of-sync conditions in the base backup it was started
from. So you can't be sure when it's okay to begin treating it as a
trustworthy backup or potential master. We track the minimum safe
recovery point for normal PITR recovery cases, but that mechanism isn't
available for slaves cloned according to this procedure. So the DBA is
just flying blind as to whether the slave is trustworthy yet. I can't
prove that that's what burnt the original complainant, but it fits the
symptoms.

regards, tom lane
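
The consistency question raised above can be illustrated with a small sketch. It assumes the DBA has some externally recorded "minimum safe" WAL location for the clone (for example, the master's WAL position noted once the file copy finished); as the message points out, the server does not track this for slaves cloned by this procedure, so that value is purely hypothetical here. The helper names are invented for illustration. The only thing taken from PostgreSQL is the text format returned by pg_last_xlog_replay_location(), two hexadecimal numbers separated by a slash, e.g. '0/3000020'.

```python
def parse_xlog_location(text):
    """Parse an 'XLOGID/XRECOFF' hex string such as '0/3000020'.

    Returns a (xlogid, xrecoff) tuple of ints so that locations compare
    numerically rather than as strings.
    """
    hi, lo = text.strip().split("/")
    return (int(hi, 16), int(lo, 16))


def has_reached_consistency(replay_location, min_safe_location):
    """Return True once replay has reached the minimum safe location.

    replay_location:   text as returned by pg_last_xlog_replay_location()
    min_safe_location: hypothetical value recorded by the DBA; the server
                       does not supply it for this cloning procedure.
    """
    return (parse_xlog_location(replay_location)
            >= parse_xlog_location(min_safe_location))
```

Comparing parsed tuples matters because the raw strings do not sort correctly: '10/0' is later than '9/FF000000', yet a plain string comparison would order them the other way. A check like this only tells the DBA that replay has passed a chosen point; whether that point is actually safe depends on how it was recorded.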
