Re: Failing start-up archive recovery at Standby mode in PG9.2.4

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Failing start-up archive recovery at Standby mode in PG9.2.4
Date: 2013-04-25 06:13:53
Message-ID: 1366870433934-5753221.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I also had a similar observation when I could reproduce this.
I tried to find why restartpoint causes the recycled segment to be named
after timeline 3, but have not been able to determine that.

When I looked at the source, I found that, the function XLogFileReadAnyTLI
which returns a segment file for reading a XLog page iterates over a list
expectedTLIs which starts with 3 in such a case (that is, in case where this
error happens). XLogFileReadAnyTLI checks the segment in both archive and
pg_xlog. So, even if such a segment (that is with timeline 3) may not be in
the archive , it may be in pg_xlog, due to recycling as we have observed.
So, such a recycled segment may be returned by XLogFileReadAnyTLI as though
it were the next segment to recover from, resulting in the "unexpected
pageaddr ..." error.

I could not understand (in case this error happens) how expectedTLIs list
had 3 at its head (for example, when XLogFileReadAnyTLI used it as we
observed) whereas at other times (when this error does not happen), it has 2
at its head until all the segments of timeline 2 are recovered from?
Also, how does recycled segment gets timeline 3 whereas 2 is expected in
this case?

Is this the right way to look at the problem and its possible fix?

I have tried going through the source regarding this but have not been able
to determine where this could accidentally happen, partly because I do not
understand recovery process (and its code) very well. Will post if find
anything useful.

regards,
Amit Langote

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Failing-start-up-archive-recovery-at-Standby-mode-in-PG9-2-4-tp5753110p5753221.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2013-04-25 06:36:34 Redundancy in comment within lock.c
Previous Message Josh Berkus 2013-04-25 04:15:33 Please add discussion topics for cluster-hackers meeting