BUG #5851: ROHS (read only hot standby) needs to be restarted manually in some cases.

From: "Mark" <dvlhntr(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5851: ROHS (read only hot standby) needs to be restarted manually in some cases.
Date: 2011-01-27 01:24:18
Message-ID: 201101270124.p0R1OIpY051049@wwwmaster.postgresql.org
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 5851
Logged by: Mark
Email address: dvlhntr(at)gmail(dot)com
PostgreSQL version: 9.0.2 x86_64
Operating system: CentOS release 5.5 (Final) | 2.6.18-194.17.1.el5 #1 SMP X86_64
Description: ROHS (read only hot standby) needs to be restarted manually in some cases.
Details:

I am seeing streaming replication break down. My current workaround is to
restart the PG instance on the ROHS. It doesn't seem to affect the master at
all, and it doesn't require a re-rsync of the base backup to get replication
going again. This has happened twice in a month now with 9.0.2.

2011-01-26 08:35:42 MST :: (postgres@10.80.2.89) LOG: could not receive data from client: Connection reset by peer
2011-01-26 08:35:42 MST :: (postgres@10.80.2.89) LOG: unexpected EOF on standby connection

This was all I had in the master's log with the log level set to debug1. I
have reset it to debug5 and will wait until it dies again, which will
hopefully give a better idea of what is going on. Nothing is being logged on
the standby. I can't find anything else to grab that shows this breakdown in
streaming replication, which won't start back up on its own.
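
With nothing in the standby log, one way to tell a stalled standby from one that is merely behind is to compare WAL positions over time: pg_current_xlog_location() on the master against pg_last_xlog_receive_location() on the standby (both available in 9.0). The following is a minimal Python sketch that only does the arithmetic on the text values those functions return. The helper names are hypothetical, and the 0xFF000000 multiplier is an assumption about the pre-9.3 WAL layout (255 segments of 16 MB per log id); it does not connect to any server.

```python
# Sketch only: compare WAL positions reported by the master and the standby.
# Assumes 9.0-style locations such as "2A/5F000020" (hex log id / hex offset).

# Assumption: before 9.3 each log id covers 255 segments of 16 MB,
# so one log id step is 0xFF000000 bytes rather than 2**32.
BYTES_PER_XLOG_ID = 0xFF000000


def xlog_to_bytes(location):
    """Convert an 'XLOGID/XRECOFF' string into an absolute byte position."""
    log_id, offset = location.strip().split("/")
    return int(log_id, 16) * BYTES_PER_XLOG_ID + int(offset, 16)


def replication_lag_bytes(master_location, standby_location):
    """Bytes of WAL the standby has not yet received (0 if caught up)."""
    lag = xlog_to_bytes(master_location) - xlog_to_bytes(standby_location)
    return max(0, lag)


def is_stalled(samples):
    """Return True when the master advanced but the standby never moved.

    samples: list of (master_location, standby_location) pairs taken
    some time apart, oldest first.
    """
    if len(samples) < 2:
        return False
    masters = [xlog_to_bytes(m) for m, _ in samples]
    standbys = [xlog_to_bytes(s) for _, s in samples]
    return masters[-1] > masters[0] and len(set(standbys)) == 1
```

Feeding is_stalled() a few pairs sampled a minute or so apart would distinguish "receive location frozen while the master keeps writing" (the case described here) from ordinary lag over a slow link, where the standby location keeps advancing.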

This is a somewhat *long* distance replication over a 100 Mbit metro line. We
have had routing issues in the past and seen replication fall behind, but once
connectivity was restored it caught up without a restart of the standby.

It probably only ships a few gigabytes of changes a day.

These are production machines, so I can't do too much playing around to try
to induce "issues".

PostgreSQL 9.0.2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
(1 row)

Is this a known issue? I didn't see anything that looked like this when I
gave the mailing list archives a quick search.

Is there somewhere else I should be looking for more details on why this is
happening?

I can post the configs if you all want them, but nothing special is happening
with regard to them.

thank you,

Mark
