Re: BUG #5452: Server core dumps coming out of recovery mode

From: Chris Copeland <chris(at)cope360(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5452: Server core dumps coming out of recovery mode
Date: 2010-06-15 16:43:16
Message-ID: AANLkTinTNy9HMwTQeoweP6cldpuunVFgW8QlYhfpw3UE@mail.gmail.com
Lists: pgsql-bugs

Heikki,

Thanks for your help on this issue.

I modified my restore script to return 1 only once and that solved the
problem.
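
A minimal sketch of the "return 1 only once" behavior, for illustration only. It is not the actual script from these servers: the trigger-file mechanism, the function name `restore`, and all paths are assumptions, and it is written in Python purely as an example.

```python
import os
import shutil


def restore(wal_name, dest_path, archive_dir, trigger_path):
    """Return the exit status a restore_command wrapper would hand back.

    Hypothetical helper: all four parameters are illustrative names,
    not anything defined by PostgreSQL itself.
    """
    src = os.path.join(archive_dir, wal_name)

    # Failover requested: consume the trigger file so that only this one
    # call reports failure. Later calls fall through to the copy below.
    if os.path.exists(trigger_path):
        os.remove(trigger_path)
        return 1

    # Normal path. This also serves the re-fetch of an already-restored
    # segment that the server performs when recovery ends.
    if os.path.isfile(src):
        shutil.copyfile(src, dest_path)
        return 0

    # Requested file is not in the archive. A real warm-standby script
    # would usually wait and retry here; that loop is omitted for brevity.
    return 1
```

The key design choice is that the trigger is removed the moment it is seen, so the request for the next (missing) segment ends recovery, while the follow-up request for the segment holding the last checkpoint record succeeds again.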

Cheers,
Chris

On Fri, May 7, 2010 at 3:35 AM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> Chris Copeland wrote:
> > I have two servers with the same hardware, OS, and pg binaries. Log files
> > are copied from the master to the standby and the standby is run in
> > recovery mode.
> >
> > When the standby is triggered to come out of recovery mode, it fails and
> > generates a core dump. Upon trying to start it after failure, it starts
> > looking for WAL files that it has already recovered.
> > ...
> > 2010-05-06 10:57:30 CDT :LOG: restored log file "00000001000000AF00000059" from archive
> > >>>> Now I trigger the restore command to return 1 to stop the recovery
> > 2010-05-06 10:59:30 CDT :LOG: could not open file "pg_xlog/00000001000000AF0000005A" (log file 175, segment 90): No such file or directory
> > 2010-05-06 10:59:30 CDT :LOG: redo done at AF/59000068
> > 2010-05-06 10:59:30 CDT :PANIC: could not open file "pg_xlog/00000001000000AF00000059" (log file 175, segment 89): No such file or directory
>
> At startup, the server needs to re-fetch the last checkpoint record.
> That means calling restore_command again for a file that was already
> restored. It looks like restore_command is failing at the re-fetch,
> which causes the PANIC.
>
> To trigger failover, restore_command needs to return 1, once, but it
> must return 0 again on any subsequent calls. I suspect your
> restore_command keeps returning 1 on the subsequent calls.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
