Re: Tracking latest timeline in standby mode

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tracking latest timeline in standby mode
Date: 2011-01-04 20:08:18
Message-ID: 4D237E32.2070204@enterprisedb.com
Lists: pgsql-hackers

On 02.11.2010 07:15, Fujii Masao wrote:
> On Mon, Nov 1, 2010 at 8:32 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Yeah, that's one approach. Another is to validate the TLI in the xlog page
>> header, it should always match the current timeline we're on. That would
>> feel more robust to me.
>
> Yeah, that seems better.

I finally got around to looking at this. I wrote a patch to validate that
the TLI in the xlog page header matches ThisTimeLineID during recovery, and
quickly noticed in testing that it doesn't catch all the cases I'd like
to catch :-(.

The problem scenario is this:

TLI 1 -----------+C-------+------->Standby
                 .
                 .
TLI 2            +C-------+------->

The two horizontal lines represent two timelines. TLI 2 forks off from
TLI 1, because of a failover to a not-completely up-to-date standby
server, for example. The plus-signs represent WAL segment boundaries and
C's represent checkpoint records.

Another standby server has replayed all the WAL on TLI 1. Its latest
restartpoint is C. The checkpoint records on the two timelines are
at the same location, at the beginning of the WAL files, which is not
all that unlikely if you have archive_timeout set, for example.

Now, if you stop and restart the standby, it will try to recover to the
latest timeline, which is TLI 2. But before the restart, it had already
replayed the WAL from TLI 1, so it's wrong to replay the WAL from the
parallel universe of TLI 2. At the moment, it will go ahead and do it,
and you end up with an inconsistent database.

I planned to fix that by checking the TLI in the xlog page header, but
that alone isn't enough in the above scenario. The TLIs in the page
headers on timeline 2 are exactly what's expected: the first page of the
segment has TLI==1, because it was just forked off from timeline 1, and the
subsequent pages have TLI==2, as they should after the checkpoint record.
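
To make the gap concrete, here is a toy model in Python (not the patch
itself). The acceptance rule is an assumption made for illustration: a
page is accepted if its TLI belongs to the history of the timeline being
recovered and TLIs never decrease from one page to the next. The names
page_tlis_plausible, segment_on_tli2 and history_of_tli2 are hypothetical.

```python
def page_tlis_plausible(page_tlis, expected_history):
    """Accept a segment if every page TLI is in the target timeline's
    history and the TLIs never go backwards (assumed rule, see lead-in)."""
    last = 0
    for tli in page_tlis:
        if tli not in expected_history or tli < last:
            return False
        last = tli
    return True


# The segment as it exists on timeline 2: the first page still carries
# TLI 1 (it was forked off from timeline 1), later pages carry TLI 2.
segment_on_tli2 = [1, 2, 2, 2]
history_of_tli2 = {1, 2}

# The check passes, although the standby had already replayed the
# corresponding WAL from timeline 1 before the restart.
print(page_tlis_plausible(segment_on_tli2, history_of_tli2))  # True
```

The check looks only at the WAL being read, so it cannot know what the
standby replayed before it was restarted.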

So we have to remember which timeline we were on before the restart.
We already remember how far we had replayed, that's the
minRecoveryPoint we store in the control file, but we have to record
the timeline along with it.

On reflection, your idea of checking the history file before replaying
anything seems much easier. We'll still need to add the timeline
alongside minRecoveryPoint to do the checking, but it's a lot easier to
do against the history file. And we can validate the TLIs on page
headers against the information from the history file as we read in the WAL.
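
A minimal sketch of that check, in Python rather than backend C, under
stated assumptions: WAL locations are plain integers, the history of the
target timeline is a list of (ancestor TLI, location where that ancestor
was left), and the function name safe_to_recover and the example
locations are hypothetical. It is not the actual implementation.

```python
def safe_to_recover(target_tli, ancestors, replayed_tli, min_recovery_point):
    """Decide whether a standby that replayed up to min_recovery_point on
    replayed_tli may continue recovery on target_tli.

    ancestors: list of (tli, switch_point) pairs taken from the target
    timeline's history, where switch_point is the location at which that
    ancestor timeline was left.
    """
    if replayed_tli == target_tli:
        return True  # already on the target timeline
    for tli, switch_point in ancestors:
        if tli == replayed_tli:
            # Only safe if we had not replayed past the fork point.
            return min_recovery_point <= switch_point
    return False  # the timeline we were on is not an ancestor at all


# TLI 2 forked off TLI 1 at location 0x3000000 (made-up value).
history_of_tli2 = [(1, 0x3000000)]

# Standby replayed TLI 1 beyond the fork: must not switch to TLI 2.
print(safe_to_recover(2, history_of_tli2, 1, 0x5000000))  # False

# Standby stopped before the fork: following TLI 2 is fine.
print(safe_to_recover(2, history_of_tli2, 1, 0x2000000))  # True
```

The first call corresponds to the problem scenario above: the recorded
timeline is an ancestor of TLI 2, but minRecoveryPoint is past the point
where TLI 2 branched off.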

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
