Re: Pausing log shipping for streaming replication

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Joseph Kregloh <jkregloh(at)sproutloud(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Pausing log shipping for streaming replication
Date: 2014-12-15 19:41:08
Message-ID: 548F3954.8040501@squeakycode.net
Lists: pgsql-general

On 12/15/2014 11:12 AM, Joseph Kregloh wrote:
> Hello,
>
> I have a master multi slave streaming replication setup. One master and
> two slaves. I need to do some maintenance on one of the slaves as one of
> the drives died however there is some other weird things going on in
> that array that I would need to investigate. So I am expecting the
> machine to be down at least two hours.
>
> I remember reading that if a master cannot connect to the slave it would
> hold the log file from shipping. Is there any other way to hold the file
> until the slave comes back online? Would it affect both slaves not
> getting their files shipped over?
>
> The good thing is that the slave in question is not serving any connections.
>
> From what I remember emptying out the archive_command would pause log
> shipping. Can the same be done by issuing a pg_stop_backup()?
>
> Thanks,
> -Joseph Kregloh

I kinda turn mine around, so to speak.
My master (web1) PG has this:

archive_command = '/usr/local/bin/pg_arch.sh "%p" "%f"'

/usr/local/bin/pg_arch.sh:
---------------
#!/bin/bash
# pg_arch.sh "%p" "%f"

archive='/pub/pgarchive'
set -e
# Check the real destination (and quote the paths) so a retried
# segment doesn't make the ln below fail under set -e.
if [ ! -f "$archive/webserv/$2" ]
then
    /usr/bin/cp "$1" "$archive/webserv/$2"
    /usr/bin/ln "$archive/webserv/$2" "$archive/web2/$2"
fi
exit 0
---------------

I have one master (web1) and two slaves (web2 and webserv).

This archives every segment and always returns 0, so the master never
stalls on archiving. (Note the use of ln: the second copy is a hard
link, so no extra disk space is wasted.)
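The hard-link trick can be seen in isolation with a quick sketch (temp paths and the segment name are made up for illustration; this is not the production script):

```shell
#!/bin/bash
# One real copy of a "WAL segment", plus a hard link for the second
# slave's directory: two names, one inode, no extra disk space.
set -e
archive=$(mktemp -d)
mkdir "$archive/webserv" "$archive/web2"

src=$(mktemp)                       # stands in for %p from archive_command
echo 'fake wal data' > "$src"
seg='000000010000000000000001'      # stands in for %f

cp "$src" "$archive/webserv/$seg"            # the one real copy
ln "$archive/webserv/$seg" "$archive/web2/$seg"  # hard link, not a copy

# Both directory entries point at the same inode, so the link count is 2.
stat -c '%h' "$archive/webserv/$seg"
```

Cleanup in either directory only drops one name; the data survives until the last link is removed, which is what lets each slave's directory be pruned independently.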

At this point I only collect WAL; this script never removes it. One
slave is very close, so it gets updated quickly. The other is very far,
and only updates at night when I can copy for less $$.

It doesn't really matter how I get the two slaves updated (the close
one actually uses streaming; the far one rsync, but that's beside the point).
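For the far slave, the nightly transfer can be as simple as a cron entry along these lines (host name and schedule are hypothetical; the real job is whatever fits your bandwidth window):

```shell
# Hypothetical crontab entry: at 2am, mirror the archive directory to
# the far slave over ssh. rsync only sends segments the slave doesn't
# already have, so re-runs are cheap.
0 2 * * *  rsync -a /pub/pgarchive/webserv/ farslave:/pub/pgarchive/
```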

The cleanup happens in reverse. I have a Perl cron job that runs every
half hour. It connects to the master (web1) and to each slave, and runs
something like:

$db is a slave, $master is the master.

$q = $db->prepare("SELECT pg_last_xlog_replay_location()");
$q->execute();
my ($webrp) = $q->fetchrow_array();
$q = undef;
$db->disconnect();

$q = $master->prepare("select file_name from pg_xlogfile_name_offset( '$webrp' )");
$q->execute();
my ($web) = $q->fetchrow_array();
$q = undef;

system("/usr/bin/sudo -u postgres /usr/local/pgsql/bin/pg_archivecleanup /pub/pgarchive/web2 $web");

What we do is have the master query the slave's replay location and
then run pg_archivecleanup. This way, if we lose communication, I won't
clean up any files, and I won't remove WAL until the slave has actually
applied it.

And each slave is independent, so I can take one down and the master
will just keep collecting. As soon as I bring the slave back up, we get
a response and cleanup starts again.

-Andy
