Re: Streaming replication status

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-16 17:22:57
Message-ID: 4B51F5F1.2050906@kaltenbrunner.cc
Lists: pgsql-hackers

Kevin Grittner wrote:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> wrote:
>> Kevin Grittner wrote:
>
>>> Right, we don't want to give the monitoring software an OS login
>>> for the database servers, for security reasons.
>> depending on what you exactly mean by that I do have to wonder how
>> you monitor more complex stuff (or stuff that requires elevated
>> privs) - say raid health, multipath configuration, status of OS
>> level updates, "are certain processes running or not" as well as
>> basic parameters like CPU or IO load. as in stuff you cannot know
>> unless you have it exported through "some" port.
>
> Many of those are monitored on the server one way or another,
> through a hardware card accessible only to the DBAs. The card sends
> an email to the DBAs for any sort of distress, including impending
> or actual drive failure, ambient temperature out of bounds, internal
> or external power out of bounds, etc. OS updates are managed by the
> DBAs through scripts. Ideally we would tie these in to our opcenter
> software, which displays status through hundreds of "LED" boxes on
> big plasma displays in our support areas (and can send emails and
> jabber messages when things get to a bad state), but since the
> messages are getting to the right people in a timely manner, this is
> a low priority as far as monitoring enhancement requests go.

Well, a lot of people (including myself) consider it a necessity to
aggregate all of that in the system monitoring; only that way can you
guarantee proper dependency handling (i.e. no need to page for
"webserver not running" if the whole server is down).
There is also a case to be made for statistics tracking and long-term
monitoring.
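
The dependency handling mentioned above can be sketched roughly as
follows. This is an illustration only, not code from any particular
monitoring system: the function name alerts_to_page and the status and
parent mappings are hypothetical, and a real setup would express the
same idea in the monitoring tool's own configuration.

```python
from typing import Dict, List, Optional


def alerts_to_page(status: Dict[str, bool],
                   parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the failing checks that are worth paging for.

    status maps a check name to True (healthy) or False (failing).
    parent maps a check name to the check it depends on, or None.
    A failing check is suppressed when any ancestor is also failing,
    so only the root cause generates a page.
    """
    def has_failed_ancestor(check: str) -> bool:
        seen = set()  # guards against accidental cycles in parent
        p = parent.get(check)
        while p is not None and p not in seen:
            if not status.get(p, True):  # unknown checks count as healthy
                return True
            seen.add(p)
            p = parent.get(p)
        return False

    return sorted(name for name, ok in status.items()
                  if not ok and not has_failed_ancestor(name))


# Hypothetical example: the whole server is down, so only "server" pages.
parent = {"server": None, "webserver": "server", "database": "server"}
status = {"server": False, "webserver": False, "database": False}
print(alerts_to_page(status, parent))
```

With the server healthy and only the webserver failing, the same
function would report the webserver instead.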

>
> Only the DBAs have OS logins to database servers. Monitoring
> software must deal with application ports (which have to be open
> anyway, so that doesn't add any security risk). Since the hardware
> monitoring doesn't know about file systems, and the disk space on
> database servers is primarily an issue for the database, it made
> sense to us to add the ability to check the space available to the
> database through a database connection. Hence, fsutil.

Still seems very backwards - there is much, much more that can only be
monitored from within the OS (and not from an external
iLO/RSA/IMM/DRAC/whatever) and that you cannot really do from within the
database (or any other application), so I'm still puzzled...

Stefan
