Re: standby waiting for what?

From: Ray Stell <stellr(at)cns(dot)vt(dot)edu>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: standby waiting for what?
Date: 2009-03-06 16:19:08
Message-ID: 20090306161908.GA27527@cns.vt.edu
Lists: pgsql-admin

On Wed, Mar 04, 2009 at 03:14:51PM -0500, Ray Stell wrote:
> On Wed, Mar 04, 2009 at 03:06:12PM -0500, Ray Stell wrote:
> > Testing pg_standby in 8.3.6. I've gotten this standby into some sort of
> > bind. It seems like it may be waiting for some WAL. How can I tell
> > what it is waiting on? I don't really know how this works, so I may
>
>
> say something silly. The standby log says:
>
> ,2512,,2009-03-04 12:23:01.483 EST,49aeb8f5.9d0,1,2009-03-04 12:23:01 EST,0, LOG: database system was interrupted; last known up at 2009-03-04 12:20:29 EST
> ,2512,,2009-03-04 12:23:01.483 EST,49aeb8f5.9d0,2,2009-03-04 12:23:01 EST,0, LOG: starting archive recovery
> ,2512,,2009-03-04 12:23:01.484 EST,49aeb8f5.9d0,3,2009-03-04 12:23:01 EST,0, LOG: restore_command = '/usr/local/pgsql/bin/pg_standby /data/pgsql/wals/alerts_oamp %f %p %r >> /home/postgresql/log/alerts_oamp/recovery.log'
>
>
> alerts_oamp]$ cat postmaster.pid
> 2510
> /data/pgsql/alerts_oamp
> 5498001 4194312
>
> alerts_oamp]$ ps -ef | grep 1005
> 1005 903 901 0 10:10 ? 00:00:00 sshd: postgresql(at)pts/0
> 1005 904 903 0 10:10 pts/0 00:00:00 -bash
> 1005 1016 1013 0 10:21 ? 00:00:00 sshd: postgresql(at)pts/1
> 1005 1017 1016 0 10:21 pts/1 00:00:00 -bash
> 1005 2510 1 0 12:23 pts/0 00:00:00 /usr/local/pgsql836/bin/postgres -D /data/pgsql/alerts_oamp
> 1005 2511 2510 0 12:23 ? 00:00:00 postgres: logger process
> 1005 2512 2510 0 12:23 ? 00:00:00 postgres: startup process
> 1005 2520 2512 0 12:23 ? 00:00:00 sh -c /usr/local/pgsql/bin/pg_standby /data/pgsql/wals/alerts_oamp 00000002000000000000001C.00512178.backup pg_xlog/RECOVERYHISTORY 000000000000000000000000 >> /home/postgresql/log/alerts_oamp/recovery.log
> 1005 2521 2520 0 12:23 ? 00:00:00 /usr/local/pgsql/bin/pg_standby /data/pgsql/wals/alerts_oamp 00000002000000000000001C.00512178.backup pg_xlog/RECOVERYHISTORY 000000000000000000000000
> 1005 2615 1017 0 12:27 pts/1 00:00:00 tail -f alerts_oamp-2009-03-04_122301.log
> 1005 3271 904 0 15:11 pts/0 00:00:00 ps -ef
> 1005 3272 904 0 15:11 pts/0 00:00:00 grep 1005
>
> alerts_oamp]$ ls -l /data/pgsql/wals/alerts_oamp/
> total 114828
> -rw------- 1 postgresql postgresql 16777216 Mar 4 11:28 00000002000000000000001A
> -rw------- 1 postgresql postgresql 16777216 Mar 4 11:29 00000002000000000000001B
> -rw------- 1 postgresql postgresql 16777216 Mar 4 12:24 00000002000000000000001C
> -rw------- 1 postgresql postgresql 16777216 Mar 4 12:25 00000002000000000000001D
> -rw------- 1 postgresql postgresql 16777216 Mar 4 12:26 00000002000000000000001E
> -rw------- 1 postgresql postgresql 16777216 Mar 4 14:45 00000002000000000000001F
> -rw------- 1 postgresql postgresql 16777216 Mar 4 14:45 000000020000000000000020
>
> any ideas what this guy is hurt by?

I stumbled onto the source of the problem. I hope somebody who knows the code can explain.
I decided to bounce the primary just to see if it would make a difference in the standby.
The primary would not restart:

,3095,,2009-03-06 10:34:01.910 EST,49b14269.c17,2,2009-03-06 10:34:01 EST,0, LOG: could not open file "pg_xlog/00000002000000000000001C" (log file 0, segment 28): No such file or directory
,3095,,2009-03-06 10:34:01.910 EST,49b14269.c17,3,2009-03-06 10:34:01 EST,0, LOG: invalid checkpoint record
,3095,,2009-03-06 10:34:01.910 EST,49b14269.c17,4,2009-03-06 10:34:01 EST,0, PANIC: could not locate required checkpoint record
,3095,,2009-03-06 10:34:01.910 EST,49b14269.c17,5,2009-03-06 10:34:01 EST,0, HINT: If you are not restoring from a backup, try removing the file "/data/pgsql/alerts_oamp/backup_label".
,3093,,2009-03-06 10:34:01.910 EST,49b14269.c15,1,2009-03-06 10:34:01 EST,0, LOG: startup process (PID 3095) was terminated by signal 6: Aborted
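
For anyone reading the first log line above, the "log file 0, segment 28" part can be worked out from the file name itself. A minimal Python sketch, assuming the usual layout of 24 hex digits (8 each for timeline, log file, and segment); parse_wal_name is a name made up for illustration, not something PostgreSQL ships:

```python
def parse_wal_name(name):
    """Split a 24-hex-digit WAL segment file name into its three fields.

    Assumes the layout TTTTTTTTLLLLLLLLSSSSSSSS (timeline, log file, segment).
    Hypothetical helper for illustration only.
    """
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)   # first 8 hex digits
    log_file = int(name[8:16], 16)  # middle 8 hex digits
    segment = int(name[16:24], 16)  # last 8 hex digits
    return timeline, log_file, segment


# The segment the primary could not open at restart.
print(parse_wal_name("00000002000000000000001C"))  # prints (2, 0, 28)
```

Hex 1C is decimal 28, which matches the segment number in the log message.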

So I removed backup_label and restarted the primary, then rebuilt the standby, and all is well. Why did
that file muck up the standby and change the file name pg was passing to pg_standby?
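
Looking back at the ps output from the stuck standby, pg_standby was being asked for 00000002000000000000001C.00512178.backup, while the archive directory listing held only plain segments 1A through 20. That comparison can be sketched in Python as below. It is only a rough check, assuming backup history files are named as 24 hex digits, 8 hex digits, then ".backup"; describe_request is a made-up helper, and the file list is copied from the ls output earlier in the thread:

```python
import re

# Assumed naming pattern for backup history files:
# <24 hex digits>.<8 hex digits>.backup
BACKUP_HISTORY = re.compile(r"^[0-9A-F]{24}\.[0-9A-F]{8}\.backup$")


def describe_request(requested, archive_files):
    """Classify the requested file name and report whether the archive has it.

    Hypothetical helper for illustration only.
    """
    if BACKUP_HISTORY.match(requested):
        kind = "backup history file"
    else:
        kind = "WAL segment or other file"
    return kind, requested in archive_files


# File names as shown by ls -l on /data/pgsql/wals/alerts_oamp/.
archive = [
    "00000002000000000000001A",
    "00000002000000000000001B",
    "00000002000000000000001C",
    "00000002000000000000001D",
    "00000002000000000000001E",
    "00000002000000000000001F",
    "000000020000000000000020",
]

print(describe_request("00000002000000000000001C.00512178.backup", archive))
# prints ('backup history file', False)
```

So the file named in the restore_command call was not in the archive directory at the time, which may be what pg_standby was waiting on.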

Thanks, looking forward to 8.4!
