Monitoring Replication

From: Brandon Phelps <bphelps(at)gls(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Monitoring Replication
Date: 2011-10-12 13:54:36
Message-ID: 4E959C1C.3040002@gls.com
Lists: pgsql-general

Hello all,

I use Nagios to monitor various things on a few servers. I recently set up a hot-standby server and would like to include the state of streaming replication in my monitoring.

I know about the pg_stat_replication view on the master and the pg_last_xlog_receive_location() system function on the standby. While there is no traffic, the value in the sent_location column of the master's view should match the value returned by pg_last_xlog_receive_location() on the standby. I also assume that if streaming replication fails completely, the pg_stat_replication view on the master will simply return no rows, so that should be easy to detect.
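
A rough sketch of that check in Python, shaped like a Nagios plugin result. It assumes the caller has already queried both servers (the database access is not shown), and the function name check_streaming is made up for illustration. The exit codes are the standard Nagios plugin ones. Comparing the two values for equality only makes sense while the master is idle, as described above.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_streaming(sent_locations, receive_location):
    """Return (exit_code, message) for an idle master/standby pair.

    sent_locations:   sent_location values read from pg_stat_replication
                      on the master, one per connected standby.
    receive_location: result of pg_last_xlog_receive_location() on the
                      standby, or None if it returned NULL.
    Both arguments are assumed to be fetched elsewhere (hypothetical caller).
    """
    if not sent_locations:
        # No rows in the view: no standby is streaming from this master.
        return CRITICAL, "pg_stat_replication is empty"
    if receive_location is None:
        return UNKNOWN, "standby reported no receive location"
    if receive_location in sent_locations:
        return OK, "standby has received up to %s" % receive_location
    return WARNING, "standby at %s, master sent %s" % (
        receive_location, ", ".join(sent_locations))


if __name__ == "__main__":
    code, message = check_streaming(["A/10018560"], "A/10018560")
    print(code, message)
```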

What confuses me is how to determine just how far behind the standby is under load. Currently, with no traffic (the servers are not in production yet), sent_location on the master is "A/10018560" and pg_last_xlog_receive_location() on the standby also returns "A/10018560". How far apart can these be before I should start worrying? I could make more sense of all this if they were simple timestamps or something, but the hex values returned boggle my mind.
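
For what it is worth, the two hex numbers in a location such as "A/10018560" are a log file ID and a byte offset within that log file, so they can be turned into a single byte position and subtracted. Below is a minimal sketch in Python, not an official formula. The function names are hypothetical, and the default multiplier of 0xFF000000 bytes per log ID is an assumption that matches the 9.0 to 9.2 on-disk layout, where the last 16 MB segment of each log file is skipped; for 9.3 and later, pass 2**32 instead.

```python
def xlog_location_to_bytes(location, bytes_per_logid=0xFF000000):
    """Convert an 'X/Y' xlog location (two hex numbers) to a byte position.

    bytes_per_logid is an assumption about the server version:
    0xFF000000 for the 9.0-9.2 layout, 2**32 for 9.3 and later.
    """
    logid, offset = location.strip().split("/")
    return int(logid, 16) * bytes_per_logid + int(offset, 16)


def lag_bytes(sent_location, receive_location, bytes_per_logid=0xFF000000):
    """Bytes the master has sent that the standby has not yet received."""
    return (xlog_location_to_bytes(sent_location, bytes_per_logid)
            - xlog_location_to_bytes(receive_location, bytes_per_logid))


if __name__ == "__main__":
    # Values from the idle servers described above: no lag.
    print(lag_bytes("A/10018560", "A/10018560"))  # prints 0
    # A standby that is 0x100 bytes behind.
    print(lag_bytes("A/10018660", "A/10018560"))  # prints 256
```

A Nagios check could then compare the result against warning and critical thresholds in bytes. What counts as "too far" depends on how much WAL the master generates under load, so any threshold would need to be tuned against real traffic.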

Any advice on these issues or other tips on monitoring the replication would be greatly appreciated.

Thanks,
Brandon
