Postgres dies on standby server after triggering failover

From: "Ori Garin" <garin(at)textkernel(dot)nl>
To: pgsql-general(at)postgresql(dot)org
Subject: Postgres dies on standby server after triggering failover
Date: 2008-11-10 09:02:44
Message-ID: 9c64d57f0811100102o7ea2d487u566b3b2192fbae4a@mail.gmail.com
Lists: pgsql-admin pgsql-general

>
> Hi everyone,
>
> I have a problem with a standby server running on Windows 2003 R2,
> Enterprise x64 edition. I use Postgres 8.3 (installed to
> C:\Program Files (x86)\).
> Everything was working fine (base backup, archiving, recovery) until I
> wanted to test failover.
> I created the trigger file and the restore_command returned a nonzero exit
> code, but postgres said "could not open file... No such file or directory"
> and died.
> I'm not sure whether this has anything to do with running a 32-bit postgres
> on a 64-bit machine, or whether it's a bug. Maybe postgres didn't think that
> the restore command failed? As the restore_command, I tried using a compiled
> Perl script, pg_standby, and even the system copy command. They all give the
> same result.
> I should mention that whenever I start postgres (in recovery mode) from the
> service console, I get a message saying that postgres started and then
> stopped. I assumed postgres was in some pending state, waiting for the
> recovery to end before it can really start (accepting connections). Should I
> start it another way (such as with pg_ctl) when I'm doing recovery?
>
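For readers following along, the failover mechanism described above can be sketched roughly as follows. This is a minimal illustration in Python, not the poster's Perl script and not pg_standby itself: the function name `restore_segment`, its arguments, and the polling interval are all hypothetical. The idea is that the command keeps waiting for the requested WAL segment, copies it when it appears, and returns a nonzero exit code once the trigger file exists, which is how the standby is told to stop recovering.

```python
import os
import shutil
import sys
import time


def restore_segment(archive_dir, wal_name, dest_path, trigger_path,
                    poll_seconds=1.0, sleep=time.sleep):
    """Return 0 after copying the segment, 1 when failover is requested."""
    source = os.path.join(archive_dir, wal_name)
    while True:
        # Prefer restoring any segment that is already archived, so that
        # pending WAL is replayed before failover takes effect.
        if os.path.exists(source):
            shutil.copyfile(source, dest_path)
            return 0
        # Nonzero exit: no more WAL is coming, the server should end recovery.
        if os.path.exists(trigger_path):
            return 1
        sleep(poll_seconds)


# Hypothetical recovery.conf entry (script name and paths are placeholders):
#   restore_command = 'python restore.py <archive_dir> %f "%p" <trigger_file>'
if __name__ == "__main__" and len(sys.argv) == 5:
    sys.exit(restore_segment(*sys.argv[1:5]))
```

Checking the archive before the trigger file is a design choice in this sketch; a script that checks the trigger first would fail over sooner but could skip segments that are already archived.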
> Since this is my first time implementing database replication, I really
> can't tell what went wrong.
> Any help would be greatly appreciated!
> Ori
>
> ----------
>
> Some dumps.
> Postgres restored files, but gave an error after I created the trigger
> file and the restore_command returned a nonzero exit code:
>
> 2008-10-29 16:39:23 CET LOG: restored log file "0000000100000011000000CF" from archive
> 2008-10-29 16:42:25 CET LOG: restored log file "0000000100000011000000D0" from archive
> 2008-10-29 16:45:56 CET LOG: restored log file "0000000100000011000000D1" from archive
> 2008-10-29 16:49:02 CET LOG: restored log file "0000000100000011000000D2" from archive
> 2008-10-29 16:58:45 CET LOG: could not open file "pg_xlog/0000000100000011000000D3" (log file 17, segment 211): No such file or directory
>
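The numbers in that last log line can be checked against the file name: a WAL segment name is 24 hexadecimal digits, 8 each for the timeline, the log file, and the segment. A small Python sketch of the decoding (the helper name `parse_wal_name` is made up for illustration):

```python
def parse_wal_name(name):
    """Split a 24-hex-digit WAL segment name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    return timeline, log, segment


print(parse_wal_name("0000000100000011000000D3"))  # (1, 17, 211)
```

So the missing file is simply the next segment after 0000000100000011000000D2, the last one restored from the archive.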
> Now postgres seemed to be stuck, or dead. Using Process Explorer I saw that
> one of the postgres.exe processes was running drwtsn32.exe (Dr Watson
> Postmortem debugger) indefinitely. When I killed drwtsn32, postgres died too,
> resulting in the following lines in the log (nothing had been logged between
> 16:58:45 and 17:20:27):
>
> 2008-10-29 17:20:27 CET LOG: startup process (PID 2952) was terminated by exception 0xC000000D
> 2008-10-29 17:20:27 CET HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
> 2008-10-29 17:20:27 CET LOG: aborting startup due to startup process failure
>
> When I looked at the Event Log, I saw an Application Error:
>
> Event Type: Error
> Event Source: Application Error
> Event Category: (100)
> Event ID: 1000
> Date: 29-10-2008
> Time: 16:58:47
> User: N/A
> Computer: S1217
> Description:
> Faulting application postgres.exe, version 8.3.1.876, faulting module
> msvcr80.dll, version 8.0.50727.762, fault address 0x0001e879.
>
> For more information, see Help and Support Center at
> http://go.microsoft.com/fwlink/events.asp.
> Data:
> 0000: 41 70 70 6c 69 63 61 74 Applicat
> 0008: 69 6f 6e 20 46 61 69 6c ion Fail
> 0010: 75 72 65 20 20 70 6f 73 ure pos
> 0018: 74 67 72 65 73 2e 65 78 tgres.ex
> 0020: 65 20 38 2e 33 2e 31 2e e 8.3.1.
> 0028: 38 37 36 20 69 6e 20 6d 876 in m
> 0030: 73 76 63 72 38 30 2e 64 svcr80.d
> 0038: 6c 6c 20 38 2e 30 2e 35 ll 8.0.5
> 0040: 30 37 32 37 2e 37 36 32 0727.762
> 0048: 20 61 74 20 6f 66 66 73 at offs
> 0050: 65 74 20 30 30 30 31 65 et 0001e
> 0058: 38 37 39 879
>
>
>
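The Data block in the event record is just the description again as ASCII bytes. A short Python sketch that decodes it, assuming the dump layout shown above (offset, colon, up to eight hex bytes, then a text column); the names `EVENT_DATA` and `decode_event_data` are hypothetical:

```python
import re

EVENT_DATA = """
0000: 41 70 70 6c 69 63 61 74 Applicat
0008: 69 6f 6e 20 46 61 69 6c ion Fail
0010: 75 72 65 20 20 70 6f 73 ure pos
0018: 74 67 72 65 73 2e 65 78 tgres.ex
0020: 65 20 38 2e 33 2e 31 2e e 8.3.1.
0028: 38 37 36 20 69 6e 20 6d 876 in m
0030: 73 76 63 72 38 30 2e 64 svcr80.d
0038: 6c 6c 20 38 2e 30 2e 35 ll 8.0.5
0040: 30 37 32 37 2e 37 36 32 0727.762
0048: 20 61 74 20 6f 66 66 73 at offs
0050: 65 74 20 30 30 30 31 65 et 0001e
0058: 38 37 39 879
"""


def decode_event_data(dump):
    """Collect the hex byte columns of each line and decode them as ASCII."""
    data = bytearray()
    for line in dump.strip().splitlines():
        _, _, rest = line.partition(":")
        for token in rest.split():
            # Stop at the first token that is not a two-digit hex byte;
            # everything after it is the text column.
            if not re.fullmatch(r"[0-9a-fA-F]{2}", token):
                break
            data.append(int(token, 16))
    return data.decode("ascii")


print(decode_event_data(EVENT_DATA))
```

The decoded text names the same faulting module and offset as the Description field, so the Data block adds no new information.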
