SR standby hangs

From: Andrew Dunstan <amdunstan(at)nc(dot)rr(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: SR standby hangs
Date: 2011-02-18 18:59:07
Message-ID: 4D5EC17B.2060905@nc.rr.com
Lists: pgsql-hackers


PostgreSQL Experts Inc has a client with a 9.0.2 streaming replication
server that somehow becomes wedged after running for some time.

The server is running as a warm standby, and the client's application
tries to connect to both the master and the slave, accepting whichever
lets it connect (hence hot standby is not turned on).

Archive files are being shipped as well as WAL streaming.

The symptom is that the recovery (startup) process blocks forever on a
semaphore. We forced a crash and got the following backtrace:

#0 0x0000003493ed5337 in semop () from /lib64/libc.so.6
#1 0x00000000005bd103 in PGSemaphoreLock (sema=0x2b14986aec38, interruptOK=1 '\001') at pg_sema.c:420
#2 0x00000000005de645 in LockBufferForCleanup () at bufmgr.c:2432
#3 0x0000000000463733 in heap_xlog_clean (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4168
#4 heap2_redo (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4858
#5 0x0000000000488780 in StartupXLOG () at xlog.c:6250
#6 0x000000000048a888 in StartupProcessMain () at xlog.c:9254
#7 0x00000000004a11ef in AuxiliaryProcessMain (argc=2, argv=<value optimized out>) at bootstrap.c:412
#8 0x00000000005c66c9 in StartChildProcess (type=StartupProcess) at postmaster.c:4427
#9 0x00000000005c8ab7 in PostmasterMain (argc=1, argv=0x17858bb0) at postmaster.c:1088
#10 0x00000000005725fe in main (argc=1, argv=<value optimized out>) at main.c:188
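
For readers less familiar with frame #2: LockBufferForCleanup() returns
only once the caller is the sole holder of a pin on the buffer, so the
wait never ends if another pin is never released or the wakeup is lost.
The following is a toy Python model of that rule only. It is not
PostgreSQL code; ToyBuffer, its methods and demo() are made-up names,
and the timeout exists purely so that the demo terminates.

```python
import threading


class ToyBuffer:
    """Toy stand-in for a shared buffer: a pin count plus a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self.pins = 0

    def pin(self):
        with self._cond:
            self.pins += 1

    def unpin(self):
        with self._cond:
            self.pins -= 1
            # Wake any waiter so it can re-check the pin count.
            self._cond.notify_all()

    def lock_for_cleanup(self, timeout=None):
        """Caller must already hold one pin.

        Returns True once the caller is the only pinner, False on timeout.
        With timeout=None this waits indefinitely, like the hung process.
        """
        with self._cond:
            return self._cond.wait_for(lambda: self.pins == 1, timeout=timeout)


def demo():
    buf = ToyBuffer()
    buf.pin()  # pin held by the process that wants the cleanup lock
    buf.pin()  # pin held by some other (hypothetical) process
    blocked = not buf.lock_for_cleanup(timeout=0.1)  # second pin still held
    buf.unpin()  # the other process lets go
    acquired = buf.lock_for_cleanup(timeout=0.1)
    return blocked, acquired
```

In this model the first attempt times out while the second pin is
outstanding, and the second attempt succeeds after unpin(). Which
process, if any, holds the extra pin on the real standby is exactly what
is not known here.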

The platform is CentOS 5.5 x86-64, kernel version 2.6.18-194.11.4.el5.

I'm not quite sure where to start digging. Has anyone else seen
something similar? Our consultant reports having seen a similar problem
elsewhere, at a client who was running hot standby on 9.0.1, but there
the problem did not recur, whereas with this client it recurs fairly
regularly.

cheers

andrew
