Just a followup - thanks to Tom, Kevin, and the rest of the user group, as usual, for their great support.
It appears that I have hit a nice firmware bug, so I am going to have to upgrade the controller firmware. Apparently the controller has trouble communicating with the BBU: although the BBU shows as good, trying to switch the cache to write-back fails with a message that it is "below threshold". Since I will be taking down 4 servers (our entire infrastructure) to perform the firmware upgrades, I will also replace the BBUs for good measure.
Once again, thanks to everyone for their assistance, and I guess this pgstat timeout can be attributed to the controller :)
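For anyone who lands on this thread with the same symptom, here is a minimal sketch of a check for the condition discussed in the quoted messages below: the controller's default cache policy is WriteBack, but the current policy has dropped to WriteThrough. It only parses text in the format quoted in this thread; the function names are made up for illustration, and it assumes each policy is on a single line (the lines quoted below are wrapped by the mail client).

```python
import re

def parse_cache_policies(text):
    """Extract the 'Default' and 'Current' cache policy fields from controller output.

    Assumes each policy is on one line, e.g.
    'Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU'.
    """
    policies = {}
    for kind in ("Default", "Current"):
        match = re.search(rf"{kind} Cache Policy:\s*(.+)", text)
        if match:
            policies[kind.lower()] = [p.strip() for p in match.group(1).split(",")]
    return policies

def write_mode_downgraded(text):
    """Return True if the default write mode is WriteBack but the current one is WriteThrough."""
    policies = parse_cache_policies(text)
    if "default" not in policies or "current" not in policies:
        return False
    # The write mode is the first comma-separated field of each policy line.
    return policies["default"][0] == "WriteBack" and policies["current"][0] == "WriteThrough"

# Sample text copied from this thread, unwrapped to one policy per line.
SAMPLE = (
    "Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU\n"
    "Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU\n"
)

if __name__ == "__main__":
    print(write_mode_downgraded(SAMPLE))
```

A check like this only reports the mismatch; whether it is safe to switch back to write-back still depends on the battery and firmware state, as discussed below.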
> -----Original Message-----
> From: pgsql-admin-owner(at)postgresql(dot)org [mailto:pgsql-admin-owner(at)postgresql(dot)org] On Behalf Of Benjamin Krajmalnik
> Sent: Wednesday, January 04, 2012 11:48 AM
> To: Kevin Grittner; pgsql-admin
> Subject: Re: [ADMIN] Problem with pgstat timeouts
> Batteries are installed and healthy. I wonder if it did it during one
> of the charge/discharge cycles and just did not revert (interestingly,
> I have one more server that did just that). I am trying to get the
> info from LSI as to what may have caused it, if it is safe to revert
> the cache mode while it is running, and, of course, the exact command
> so I don't foobar things up.
> > -----Original Message-----
> > From: Kevin Grittner [mailto:Kevin(dot)Grittner(at)wicourts(dot)gov]
> > Sent: Wednesday, January 04, 2012 11:44 AM
> > To: pgsql-admin; Benjamin Krajmalnik
> > Subject: Re: [ADMIN] Problem with pgstat timeouts
> > "Benjamin Krajmalnik" <kraj(at)servoyant(dot)com> wrote:
> > > Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache
> > > if Bad BBU
> > > Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write
> > > Cache if Bad BBU
> > > Is it possible that the WriteThrough is what is causing the high
> > > io (and maybe the pgstat wait timeouts as well)?
> > Yes.
> > > If this is the case, would it be safe to change the cache to Write
> > > back?
> > Maybe. How did it get into this (non-default) state? Are batteries
> > installed and healthy?
> > > Additionally, and somewhat unrelated, is there anything special
> > > which I need to do when restarting the primary server vis-à-vis
> > > the streaming replication server? In other words, if I were to
> > > restart the main server, will the streaming replication server
> > > reconnect and pick up once the primary comes online?
> > I think that should be pretty automatic as long as you haven't
> > promoted the standby to be the new master.
> > -Kevin