Skip site navigation (1) Skip section navigation (2)

Re: Streaming replication status

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming replication status
Date: 2010-01-13 08:47:33
Message-ID: 4B4D88A5.2050905@2ndquadrant.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Stefan Kaltenbrunner wrote:
> so is there an actually concrete proposal of _what_ interals to expose? '

The pieces are coming together...summary:

-Status quo:  really bad, but could probably ship anyway because 
existing PITR is no better and people manage to use it
-Add slave pg_current_xlog_location() and something like 
pg_standby_received_xlog_location():  Much better, gets rid of the worst 
issues here.
-Also add pg_standbys_xlog_location() on the master:  while they could 
live without it, this really helps out the "alert/monitor" script writer 
whose use cases keep popping up here.

Details...the original idea from Fujii was:

"I'm thinking something like pg_standbys_xlog_location() [on the 
primary] which returns
one row per standby servers, showing pid of walsender, host name/
port number/user OID of the standby, the location where the standby
has written/flushed WAL. DBA can measure the gap from the
combination of pg_current_xlog_location() and pg_standbys_xlog_location()
via one query on the primary."

After some naming quibbles and questions about what direction that 
should happen in, Tom suggested the initial step here is:

"It seems to me that we should have at least two functions available
on the slave: latest xlog location received and synced to disk by
walreceiver (ie, we are guaranteed to be able to replay up to here);
and latest xlog location actually replayed (ie, the state visible
to queries on the slave).  The latter perhaps could be
pg_current_xlog_location()."

So there's the first two of them:  on the slave, 
pg_current_xlog_location() giving the latest location replayed, and a 
new one named something like pg_standby_received_xlog_location().  If 
you take the position that an unreachable standby does provide answers 
to these questions too (you just won't like them), this pair might be 
sufficient to ship.

To help a lot at dealing with all the error situations where the standby 
isn't reachable and segments are piling up (possibly leading to full 
disk), the next figure that seems to answer the most questions is asking 
the primary "what's the location of the last WAL segment file in the 
pile of ones to be archived/distributed that has been requested (or 
processed if that's the easier thing to note) by the standby?".  That's 
what is named pg_standbys_xlog_location() in the first paragraph I 
quoted.  If you know enough to identify that segment file on disk, you 
can always look at its timestamp (and the ones on the rest of the files 
in that directory) in a monitoring script to turn that information into 
segments or a time measurement instead--xlog segments are nicely ordered 
after all.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com  www.2ndQuadrant.com


In response to

Responses

pgsql-hackers by date

Next:From: Fujii MasaoDate: 2010-01-13 08:47:48
Subject: Re: Streaming replication and non-blocking I/O
Previous:From: Boszormenyi ZoltanDate: 2010-01-13 08:22:28
Subject: Re: ECPG patch causes warning

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group