Re: Streaming replication status

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-15 06:53:18
Message-ID: 4B5010DE.50802@kaltenbrunner.cc
Lists: pgsql-hackers

Greg Smith wrote:
> Fujii Masao wrote:
>>> "I'm thinking something like pg_standbys_xlog_location() [on the primary] which returns
>>> one row per standby server, showing pid of walsender, host name/
>>> port number/user OID of the standby, the location where the standby
>>> has written/flushed WAL. DBA can measure the gap from the
>>> combination of pg_current_xlog_location() and pg_standbys_xlog_location()
>>> via one query on the primary."
>>>
>>
>> This function is useful but not essential for troubleshooting, I think.
>> So I'd like to postpone it.
>>
>
> Sure; in a functional system where primary and secondary are both up,
> you can assemble the info using the new functions you just added, so
> this other one is certainly optional. I just took a brief look at the
> code of the features you added, and it looks like it exposes the minimum
> necessary to make this whole thing possible to manage. I think it's OK
> if you postpone this other bit, more important stuff for you to work on.

agreed
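
For illustration, the gap measurement described in the quoted proposal could
be sketched roughly as below. This is only a sketch under assumptions: the
helper names are made up, pg_standbys_xlog_location() is merely proposed in
this thread, and the "X/Y" location string is treated as a plain 64-bit byte
position. Releases of this era may count the bytes per log file slightly
differently, so treat the result as an approximation.

```python
def parse_xlog_location(location: str) -> int:
    """Convert an 'X/Y' hex location string into an absolute byte position.

    Assumption: X is the high 32 bits and Y the low 32 bits of the position.
    """
    log_id, offset = location.split("/")
    return (int(log_id, 16) << 32) | int(offset, 16)


def replication_gap_bytes(primary_location: str, standby_location: str) -> int:
    """Return how many bytes of WAL the standby is behind the primary.

    Both arguments are location strings such as the ones returned by
    pg_current_xlog_location() on the primary and reported for the standby.
    """
    return parse_xlog_location(primary_location) - parse_xlog_location(standby_location)


if __name__ == "__main__":
    # Hypothetical values: primary at 0/3000158, standby flushed up to 0/3000000.
    print(replication_gap_bytes("0/3000158", "0/3000000"))
```

A monitoring script would feed this the two locations fetched from the
primary and alert once the gap exceeds whatever threshold the site considers
acceptable.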

>
> So: the one piece of information I thought was most important to expose
> here at an absolute minimum is there now. Good progress. The other
> popular request that keeps popping up here is providing an easy way to
> see how backlogged the archive_command is, to make it easier to monitor
> for out of disk errors that might prove catastrophic to replication.

I tend to disagree - in any reasonable production setup basic stuff
like disk space usage is monitored by non-application-specific means.
While monitoring the backlog might be interesting for other reasons, citing
disk space usage/exhaustion as the motivation seems just wrong.
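
As a rough sketch of what such filesystem-level monitoring of the archiver
backlog could look like: the server marks each WAL segment that is still
waiting for archive_command with a ".ready" file in the archive_status
directory, so counting those files gives the backlog. The function name and
the wal_dir_name parameter are made up for this example, and the script is
assumed to have read access to the data directory.

```python
from pathlib import Path
from typing import List


def archive_backlog(data_directory: str, wal_dir_name: str = "pg_xlog") -> List[str]:
    """Return the WAL segment names still waiting to be archived, sorted.

    Assumption: segments pending archival are marked by '<segment>.ready'
    files under <data_directory>/<wal_dir_name>/archive_status.
    """
    status_dir = Path(data_directory) / wal_dir_name / "archive_status"
    suffix = ".ready"
    return sorted(p.name[: -len(suffix)] for p in status_dir.glob("*" + suffix))


def backlog_exceeds(data_directory: str, max_segments: int) -> bool:
    """True when more than max_segments segments are waiting for the archiver."""
    return len(archive_backlog(data_directory)) > max_segments
```

The first name in the returned list is the oldest pending segment, which is
usually the one a failing archive_command is stuck on.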

[...]
>
> I'd find this extremely handy as a hook for monitoring scripts that want
> to watch the server but don't have access to the filesystem directly,
> even given those limitations. I'd prefer to have the "tried to"
> version, because it will populate with the name of the troublesome file
> it's stuck on even if archiving never gets its first segment delivered.

While fancy and all, I think this goes way too far for the first cut at
SR (or say this release); monitoring disk usage and tracking log files
for errors are SOLVED issues in established production setups. If you
are in an environment that does neither for each and every server,
independent of what you have running on it, or in a setup where the
sysadmins are clueless and the poor DBA has to hack around that fact, you
have way bigger issues anyway.

Stefan
