Re: [BUG] Archive recovery failure on 9.3+.

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: katsumata(dot)tomonari(at)po(dot)ntts(dot)co(dot)jp, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUG] Archive recovery failure on 9.3+.
Date: 2014-02-13 17:45:38
Message-ID: 52FD04C2.4060701@vmware.com
Lists: pgsql-hackers

On 02/13/2014 06:47 PM, Heikki Linnakangas wrote:
> On 02/13/2014 02:42 PM, Heikki Linnakangas wrote:
>> The behavior where we prefer a segment from archive with lower TLI over
>> a file with higher TLI in pg_xlog actually changed in commit
>> a068c391ab0. Arguably changing it wasn't a good idea, but the problem
>> your test script demonstrates can be fixed by not archiving the partial
>> segment, with no change to the preference of archive/pg_xlog. As
>> discussed, archiving a partial segment seems like a bad idea anyway, so
>> let's just stop doing that.
>
> After some further thought, while not archiving the partial segment
> fixes your test script, it's not enough to fix all variants of the
> problem. Even if archive recovery doesn't archive the last, partial,
> segment, if the original master server is still running, it's entirely
> possible that it fills the segment and archives it. In that case,
> archive recovery will again prefer the archived segment with lower TLI
> over the segment with newer TLI in pg_xlog.
>
> So I agree we should commit the patch you posted (or something to that
> effect). The change to not archive the last segment still seems like a
> good idea, but perhaps we should only do that in master.
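
The preference problem described in the quoted text can be illustrated with a small model. This is only a sketch under stated assumptions, not code from the patch: the types `Location` and `Copy` and the functions `have_copy`, `pick_location_first` and `pick_timeline_first` are hypothetical names invented for illustration. It contrasts polling the archive for all timelines before looking at pg_xlog with polling both locations one timeline at a time, newest first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: where a copy of the wanted segment lives. */
typedef enum { LOC_ARCHIVE, LOC_PG_XLOG } Location;

/* One available copy of the segment, on timeline `tli`. */
typedef struct { unsigned tli; Location loc; } Copy;

/* True if a copy with this timeline and location exists. */
static bool have_copy(const Copy *copies, size_t n, unsigned tli, Location loc)
{
    for (size_t i = 0; i < n; i++)
        if (copies[i].tli == tli && copies[i].loc == loc)
            return true;
    return false;
}

static const Location poll_order[] = { LOC_ARCHIVE, LOC_PG_XLOG };

/*
 * Location-first polling: try the archive for every timeline, and only
 * then pg_xlog. `tlis` is ordered newest first. Returns false if no copy
 * exists anywhere.
 */
static bool pick_location_first(const Copy *copies, size_t n,
                                const unsigned *tlis, size_t ntlis, Copy *out)
{
    assert(out != NULL);
    for (size_t l = 0; l < 2; l++)
        for (size_t t = 0; t < ntlis; t++)
            if (have_copy(copies, n, tlis[t], poll_order[l]))
            {
                out->tli = tlis[t];
                out->loc = poll_order[l];
                return true;
            }
    return false;
}

/*
 * Timeline-first polling: for each timeline, newest first, try the
 * archive and then pg_xlog before moving on to an older timeline.
 */
static bool pick_timeline_first(const Copy *copies, size_t n,
                                const unsigned *tlis, size_t ntlis, Copy *out)
{
    assert(out != NULL);
    for (size_t t = 0; t < ntlis; t++)
        for (size_t l = 0; l < 2; l++)
            if (have_copy(copies, n, tlis[t], poll_order[l]))
            {
                out->tli = tlis[t];
                out->loc = poll_order[l];
                return true;
            }
    return false;
}
```

With one copy on timeline 1 in the archive and another on timeline 2 in pg_xlog, the location-first function picks the timeline 1 copy, while the timeline-first function picks the timeline 2 copy.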

To bring this to a conclusion: barring any further insights, I'm
going to commit the attached patch to master and REL9_3_STABLE. Please
have a look at the patch to see if I'm missing something. I modified
the state machine to skip the XLOG_FROM_PG_XLOG state if reading in
XLOG_FROM_ARCHIVE failed; otherwise you would first scan the archive and
pg_xlog together, and then pg_xlog alone, which is pointless.
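
As a rough illustration of the state transition described above (a sketch, not the patch itself): the enum `WalSource`, its members, and `next_source_after_failure` are hypothetical stand-ins, not the definitions in xlog.c. The sketch assumes that a read in the archive state has already looked in pg_xlog too, and that a failed streaming attempt falls back to the archive state.

```c
#include <assert.h>

/* Hypothetical stand-ins for the WAL source states; not the xlog.c enum. */
typedef enum { SRC_ARCHIVE, SRC_PG_XLOG, SRC_STREAM } WalSource;

/*
 * Choose the state to try after a read in `failed` found nothing.
 *
 * Assumption: a read in SRC_ARCHIVE scans the archive and pg_xlog
 * together, so retrying pg_xlog alone afterwards cannot succeed and the
 * machine moves straight on to streaming.
 */
static WalSource next_source_after_failure(WalSource failed)
{
    switch (failed)
    {
        case SRC_ARCHIVE:   /* pg_xlog was already covered by this scan */
        case SRC_PG_XLOG:
            return SRC_STREAM;
        case SRC_STREAM:    /* assumed: go back and poll the archive */
            return SRC_ARCHIVE;
    }
    assert(0 && "unknown WalSource");
    return SRC_ARCHIVE;
}
```

The point of the sketch is only that SRC_ARCHIVE and SRC_PG_XLOG share the same successor, so the pg_xlog-only scan is never entered after a failed combined scan.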

In master, I'm also going to remove the "archive last segment on old
timeline" code.

- Heikki

Attachment Content-Type Size
0001-Change-the-order-that-pg_xlog-and-WAL-archive-are-po.patch text/x-diff 3.2 KB
