Avoiding timeline generation

From: Daniel Farina <daniel(at)heroku(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Avoiding timeline generation
Date: 2011-03-25 01:00:27
Message-ID: AANLkTin63XyarR7eEeUgj4p3sedBBvFauMvBF=dtf1gk@mail.gmail.com
Lists: pgsql-hackers

Hello List,

I have a couple of use cases that are important to me, but my reading
of xlog.c suggests I'm stuck between a rock and a hard place. Or, I am
missing some commonly used pattern -- forgive me in that case. I am
reading 9.0.3 when making these determinations.

Here is the mechanism: I want to author a recovery.conf to perform
some amount of restore_command or streaming replication based
recovery, but I do *not* want to generate a new timeline. Rather, I
want to stay in hot standby mode to allow read-only connections.
Right now, at xlog.c:6346, we have code like this that is run after
zero to many WAL segments have been applied:

    /*
     * Consider whether we need to assign a new timeline ID.
     *
     * If we are doing an archive recovery, we always assign a new ID. This
     * handles a couple of issues. If we stopped short of the end of WAL
     * during recovery, then we are clearly generating a new timeline and must
     * assign it a unique new ID. Even if we ran to the end, modifying the
     * current last segment is problematic because it may result in trying to
     * overwrite an already-archived copy of that segment, and we encourage
     * DBAs to make their archive_commands reject that. We can dodge the
     * problem by making the new active segment have a new timeline ID.
     *
     * In a normal crash recovery, we can just extend the timeline we were in.
     */
    if (InArchiveRecovery)
    {
        ThisTimeLineID = findNewestTimeLine(recoveryTargetTLI) + 1;
        ereport(LOG,
                (errmsg("selected new timeline ID: %u", ThisTimeLineID)));
        writeTimeLineHistory(ThisTimeLineID, recoveryTargetTLI,
                             curFileTLI, endLogId, endLogSeg);
    }

InArchiveRecovery is set to true as soon as readRecoveryCommandFile
completes normally, and it looks as though that guarantees we will get
a new timeline. If one tries a bizarre hack, such as ensuring the
restore_command never terminates, recovery never finishes (as one may
expect) and one cannot connect to the server. The latter is less
expected: presuming hot standby, I would have thought connections
would be allowed if the server had been terminated cleanly.
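
To make the decision above concrete, here is a minimal standalone sketch
of it in C. It is a simplified model, not the xlog.c code: the function
names, the parameters, and the stub that replaces findNewestTimeLine
are all hypothetical, and the real function looks for timeline history
files rather than taking the answer as an argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/*
 * Hypothetical stand-in for findNewestTimeLine(). The caller passes in the
 * newest timeline ID assumed to exist already; the real function discovers
 * it by probing for timeline history files.
 */
TimeLineID
newest_timeline_stub(TimeLineID start_tli, TimeLineID newest_known_tli)
{
    /* The newest known timeline can never be older than the starting one. */
    assert(newest_known_tli >= start_tli);
    return newest_known_tli;
}

/*
 * Simplified model of the end-of-recovery decision quoted above:
 * archive recovery always moves to a fresh timeline ID, while plain
 * crash recovery keeps extending the timeline it was already on.
 */
TimeLineID
timeline_after_recovery(bool in_archive_recovery,
                        TimeLineID current_tli,
                        TimeLineID recovery_target_tli,
                        TimeLineID newest_known_tli)
{
    if (in_archive_recovery)
        return newest_timeline_stub(recovery_target_tli,
                                    newest_known_tli) + 1;

    return current_tli;
}
```

In this model, any recovery.conf that is read successfully corresponds to
in_archive_recovery being true, so the result is always strictly greater
than every timeline already known. There is no input that ends archive
recovery and keeps the old ID, which is the limitation described above.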

The things I want to do with the ability to suppress a new timeline:

* Offline WAL application -- I want to be able to bring up a second
server, perform some amount of point in time recovery, and then stop
and archive. It would be nice to support read-only queries in this
case to test the recovered database. The goal of this is to reduce
recovery time in a disaster scenario without tying up resources on a
live server.

* The ability to quiesce a system by bringing it into read-only state
that generates no new WAL while still being able to ship old WAL.
This is useful in switchover scenarios, and is a cousin of cascading
replication support -- whereby a hot standby may also act as a
WAL-relay. The need for cascading is mitigated by having one's own
WAL archiving machinery, but even in the absence of that it is highly
desirable to continue to allow read-only access to the database while
guaranteeing no new WAL is generated.

There's also the possibility that someone here knows the timeline and
WAL application code well enough to tell me whether I can go in there
and do some surgery to reliably undo a new timeline as a workaround,
knowing that my data directory is fully intact for more accurate
restorations. Or, perhaps bludgeoning the system with non-writable
UNIX permissions may be acceptable.

Please relate your thoughts...

Thanks,

--
fdr
