Re: Understanding streaming replication

From: Pawel Veselov <pawel(dot)veselov(at)gmail(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Understanding streaming replication
Date: 2012-11-12 18:11:11
Message-ID: CAMnJ+BfAQn6Y-FsqDrnt9Q45Xr9CiQ-u6Q=xSZzAj8enwgo_2A@mail.gmail.com
Lists: pgsql-general

On Mon, Nov 12, 2012 at 1:36 AM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> I'll try to answer the questions I can.
>

Thank you!

> Pawel Veselov wrote:
> > I've been struggling with understanding all the necessary pieces for
> streaming replication. So I put
> > down the pieces as I did understand them, and would appreciate if you
> guys could point out any of the
> > stuff I understood or have done wrong.
> >
> > The set up is pgpool + streaming replication + hot stand by. No load
> balancing, stand-by nodes will
> > not receive any application queries (I don't have that big of a query
> load, and I don't want to risk
> > inconsistent reads). There are no shared file systems, but there is a
> way to rsync/scp files between
> > nodes. Fail-over is automatic, and should kick in within reasonably
> small period after master failure.
> >
> > 1. Archiving. Should be turned on on all the nodes. The archive command
> should copy the archive file
> > to the local archive directory, and rsync archive directory between all
> the nodes. My understanding is
> > that archiving is necessary if a stand-by node ever "missed" enough WAL
> updates to need an old enough
> > WAL that might have been removed from pg_xlog.
> You don't give details about how the rsync is triggered,
> but I'd advise against having rsync as part of archive_command.
> First, it is slow and if there is a lot of activity, the
> archiver will not be able to keep up.
> Second, if rsync fails, the WAL file will not be considered
> archived.
>
> Both these things will keep the WAL files from being deleted
> from pg_xlog.
>
> I'd schedule rsync as a cron job or similar.
>

From your later comments, it's also apparent that these archived WALs will
be useless after failover (for the purpose of recovery), so there is no
reason to send them to all the nodes after all.
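
In that case, something along these lines should be enough (just a sketch of
what I have in mind; the paths and the backup host name are made up, and the
archive_command pattern is the one from the docs):

    # postgresql.conf on each node
    archive_mode = on
    archive_command = 'test ! -f /var/lib/pgsql/archive/%f && cp %p /var/lib/pgsql/archive/%f'

    # cron, kept separate from archive_command as you suggested:
    # ship the archive to the backup host instead of to every node
    */5 * * * *  rsync -a /var/lib/pgsql/archive/ backup:/backups/wal_archive/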

>
> > QUESTION: After the failover, the new master will start archiving its
> WAL files. These archived WALs
> > will not collide in any way with the archived WALs generated by previous
> master(s)?
>
> They will not, because the standby starts a new "time line"
> when it is promoted to primary, which will result in new
> WAL file names.
>
> > QUESTION: What is a good policy for archive clean up? From the
> perspective to only remove archive
> > files that are guaranteed to never be required by any nodes.
>
> You cannot tell from the primary's side.
> Since you also need the archives to restore an online backup,
> I'd keep them a long as your backup policy dictates.
> I hope you don't rely on standby databases for backup (just
> imagine an accidental DROP TABLE that gets propagated to all
> standbys within seconds).
>

I don't rely on stand-bys for backup. But that timeline establishment
business is a key piece that I didn't realize.
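
If I'm reading the docs right, the timeline is encoded directly in the WAL
segment names, so the archives can't collide even in a shared directory. For
example (made-up segment numbers):

    000000010000000000000009    written by the old master on timeline 1
    00000002.history            created when the standby is promoted
    000000020000000000000009    written by the new master on timeline 2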

>
> > 2. Failover. On master failure, pgpool will automatically select a new
> master, and degenerate all
> > other nodes. The cluster is now in the emergency state and requires
> manual intervention for
> > reconfiguration and recovery. pgpool executes a script to promote a
> node, that script will create a
> trigger file on a newly selected master node, and postgres will exit
> stand-by mode.
> >
> > QUESTION: If multiple pgpools are running, and if there are no network
> problems, and configuration
> > files are identical, is there any guarantee that the same stand-by node
> will be selected for
> > promotion? Concern here is that with configuration of (M-SB0-SB1) one
> pgpool decides to promote SB0
> > and another - SB1, causing both of them to enter master mode, and
> splitting the cluster. It does look
> > that pgpool will always select next "alive" node for promotion, but I
> couldn't find a definitive
> > statement on that.
>
> I don't know about pgpool and its abilities to handle
> cluster failover, but I wouldn't go this way at all.
> Even if the answer were that in the circumstances you
> describe things would work, you can depend on it that
> things will go wrong in ways different from what you
> expect, e.g. a broken network card.
> The consequences would be worse than I'd like to imagine.
>

I would imagine this situation can happen in any case; I don't see how it's
logically avoidable. If you have only one agent with the power to promote a
node to be a new master, you have a single point of failure. If you have
multiple agents that can do the promotion, there is always a risk that they
fall out of sync.
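
Just to make sure I'm describing the same mechanism: the promotion itself is
only a trigger file, roughly like this (a sketch; the host names, paths,
script name and the pgpool placeholders are my assumptions):

    # recovery.conf on each stand-by
    standby_mode = 'on'
    primary_conninfo = 'host=master port=5432 user=replicator'
    trigger_file = '/var/lib/pgsql/promote_trigger'

    # pgpool.conf
    failover_command = '/etc/pgpool-II/failover.sh %d %P %H'

    # /etc/pgpool-II/failover.sh
    #!/bin/sh
    # args: failed node id, old primary node id, new master host
    failed_node_id="$1"; old_primary_id="$2"; new_master_host="$3"
    # only promote when it is the primary that failed
    if [ "$failed_node_id" = "$old_primary_id" ]; then
        ssh postgres@"$new_master_host" "touch /var/lib/pgsql/promote_trigger"
    fi

My concern is exactly two pgpool instances running that script with a
different idea of who the new master should be.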

> If you want reliable automatic failover, consider cluster
> software.
>

Anything you could recommend?

[skipped]

> Yours,
> Laurenz Albe
>
