Re: Streaming replication status

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-15 04:07:52
Message-ID: 4B4FEA18.5080705@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Fujii Masao wrote:
>> "I'm thinking something like pg_standbys_xlog_location() [on the primary] which returns
>> one row per standby servers, showing pid of walsender, host name/
>> port number/user OID of the standby, the location where the standby
>> has written/flushed WAL. DBA can measure the gap from the
>> combination of pg_current_xlog_location() and pg_standbys_xlog_location()
>> via one query on the primary."
>>
>
> This function is useful but not essential for troubleshooting, I think.
> So I'd like to postpone it.
>

Sure; in a functional system where primary and secondary are both up,
you can assemble the info using the new functions you just added, so
this other one is certainly optional. I just took a brief look at the
code of the features you added, and it looks like it exposes the minimum
necessary to make this whole thing possible to manage. I think it's OK
if you postpone this other bit, more important stuff for you to work on.

So: the one piece of information I though was most important to expose
here at an absolute minimum is there now. Good progress. The other
popular request that keeps popping up here is providing an easy way to
see how backlogged the archive_command is, to make it easier to monitor
for out of disk errors that might prove catastrophic to replication.

I just spent some time looking through the WAL/archiving code in that
context. It looks to me that that this information isn't really stored
anywhere right now. The only thing that knows what segment is currently
queued up to copy over is pgarch_ArchiverCopyLoop via its call to
pgarch_readyXlog. Now, this is a pretty brute-force piece of code: it
doesn't remember its previous work at all, it literally walks the
archive_status directory looking for *.ready files that have names that
look like xlog files, then returns the earliest. That unfortunately
means that it's not even thinking in the same terms as all these other
functions, which are driven by the xlog_location advancing, and then the
filename is computed from that. All you've got is the filename at this
point, and it's not even guaranteed to be real--you could easily fool
this code if you dropped an inappropriately named file into that directory.

I could easily update this code path to save the name of the last
archived file in memory while all this directory scanning is going on
anyway, and then provide a UDF to expose that bit of information. The
result would need to have documentation that disclaims it like this:

pg_last_archived_xlogfile() text: Get the name of the last file the
archive_command [tried to|successfully] archived since the server was
started. If archiving is disabled or no xlog files have become ready to
archive since startup, a blank line will be returned. It is possible
for this function to return a result that does not reflect an actual
xlogfile if files are manually added to the server's archive_status
directory.

I'd find this extremely handy as a hook for monitoring scripts that want
to watch the server but don't have access to the filesystem directly,
even given those limitations. I'd prefer to have the "tried to"
version, because it will populate with the name of the troublesome file
it's stuck on even if archiving never gets its first segment delivered.

I'd happily write a patch to handle all that if I thought it would be
accepted. I fear that the whole approach will be considered a bit too
hackish and get rejected on that basis though. Not really sure of a
"right" way to handle this though. Anything better is going to be more
complicated because it requires passing more information into the
archiver, with little gain for that work beyond improving the quality of
this diagnostic routine. And I think most people would find what I
described above useful enough.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2010-01-15 04:20:41 Re: Streaming replication status
Previous Message Tim Bunce 2010-01-15 04:02:02 Package namespace and Safe init cleanup for plperl [PATCH]