On Fri, Jan 29, 2010 at 11:49 PM, Mason Hale <mason(at)onespot(dot)com> wrote:
> While I did not remove the trigger file, I did rename recovery.conf to
> That file contained the recovery_command configuration that identified the
> trigger file. So that rename should have eliminated the problem. But it
> didn't. Even after making this change and taking the trigger file out of the
> equation my database failed to come online.
Renaming recovery.conf doesn't resolve the problem at all. Instead, the
sysadmin only had to remove the trigger file that had the wrong permissions
and then restart postgres.
>> 9.) The server did not come up (again). This time the contents of the
>> new postgresql.log file were:
>> [postgres(at)prod-db-2 pg_log]$ tail -n 100 postgresql-2010-01-18_211132.log
>> 2010-01-18 21:11:32 UTC ()LOG: database system was interrupted while in recovery at log time 2010-01-18 20:10:59 UTC
>> 2010-01-18 21:11:32 UTC ()HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
>> 2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/0000000200003C82000000A3" (log file 15490, segment 163): No such file or directory
>> 2010-01-18 21:11:32 UTC ()LOG: invalid primary checkpoint record
>> 2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/0000000200003C8200000049" (log file 15490, segment 73): No such file or directory
>> 2010-01-18 21:11:32 UTC ()LOG: invalid secondary checkpoint record
>> 2010-01-18 21:11:32 UTC ()PANIC: could not locate a valid checkpoint record
>> 2010-01-18 21:11:32 UTC ()LOG: startup process (PID 9328) was terminated by signal 6: Aborted
>> 2010-01-18 21:11:32 UTC ()LOG: aborting startup due to startup process failure
You seem to be focusing on the above failure. I think it happened because
recovery.conf was no longer in place, so no restore_command was given. As a
result, the WAL files required for recovery (e.g.,
pg_xlog/0000000200003C82000000A3) could not be restored from the archive,
and recovery failed.
If the sysadmin had left recovery.conf in place and removed only the trigger
file, pg_standby in restore_command would have restored all the WAL files
required for recovery, and recovery would have proceeded normally.
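For illustration, a recovery.conf along the following lines is the kind of
setup I mean. This is only a sketch: the trigger file path and the archive
directory are placeholders I made up, not values from your report, so
substitute your own.

```
# recovery.conf (sketch; both paths below are assumed placeholders)
# -t names the trigger file; pg_standby keeps waiting for and restoring
# WAL segments from the archive until that file appears.
# %f = WAL file name requested, %p = destination path, %r = oldest file
# that must be kept for a restart (optional, used for archive cleanup).
restore_command = 'pg_standby -t /tmp/pgsql.trigger /path/to/wal_archive %f %p %r'
```

As long as this file stays in the data directory and the trigger file is
absent, the startup process can still fetch the segments it needs from the
archive.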
Hope this helps.
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center