Re: Infrastructure monitoring

From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: John Hansen <john(at)geeknet(dot)com(dot)au>, pgsql-www(at)postgresql(dot)org, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Subject: Re: Infrastructure monitoring
Date: 2006-01-14 02:16:59
Message-ID: 20060113220930.R28752@ganymede.hub.org
Lists: pgsql-www

On Fri, 13 Jan 2006, Josh Berkus wrote:

> Jim,
>
>> Search has been down for at least 2 days now, and this certainly isn't
>> the first time it's happened. There have also been cases of archives
>> getting stuck, and probably other outages besides those that went on
>> until someone emailed about it.
>>
>> Would it be difficult to set up something to monitor these various
>> services? I know there's at least one OSS tool to do it, though I have
>> no idea how hard it would be to tie that into the current
>> infrastructure.
>
> We have an open offer of Hyperic licenses, and they support FreeBSD now.

Not to discount the offer ... but, what exactly would that provide us? We
already monitor the *servers*; it's what is inside of the servers that
needs better monitoring ... knowing nothing about Hyperic, does it
provide something for that?

In the case of the archives, for instance, the problem was a Perl process
that for some unknown reason got stuck randomly ... I replaced it with an
awk script, and it hasn't gotten stuck since ... I also redirected cron's
email to scrappy(at)postgresql(dot)org, so that any errors show up in my
mailbox instead of root's, and I get an hourly reminder that things are
running well ...

In the case of search ... John would be better at answering that, but when
he and I talked this past week, he mentioned that he was moving it all
over to two new servers, for which I changed the DNS on Wednesday ...

As I've said above ... physical servers are being monitored, so if anyone
has some ideas on how we can improve "content monitoring", for lack of a
better word, I know I'm all ears ...

Again, if Hyperic can offer something for this, let me know ...

----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy(at)hub(dot)org Yahoo!: yscrappy ICQ: 7615664
