Simplifying replication

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Simplifying replication
Date: 2010-10-19 00:16:22
Message-ID: 4CBCE356.2080703@agliodbs.com
Lists: pgsql-hackers

Robert asked me to write this up, so here it is.

It is critical that we make replication easier to set up, administer
and monitor than it currently is. In my conversations with people, this
is more important to our users and the adoption of PostgreSQL than
synchronous replication is.

First, I'm finding myself constantly needing to tutor people on how to
set up replication. The mere fact that it requires a minimum 1-hour
class to explain how to use it, or a 10-page tutorial, tells us it's too
complex. As further evidence, Bruce and I explained binary replication
to several MySQL geeks at OpenSQLCamp last weekend, and they were
horrified at the number and complexity of the steps required. As it
currently is, binary replication is not going to win us a lot of new
users from the web development or virtualization world.

I had to write it up a couple of times; I started with a critique of the
various current commands and options, but that seemed to miss the point.
So instead, let me lay out how I think replication should work in my
dream world 9.1:

1. Any postgresql standalone server can become a replication master
simply by enabling replication connections in pg_hba.conf. No other
configuration is required, and no server restart is required.

2. Should I choose to adjust master configuration, for, say, performance
reasons, most replication variables (including ones like
wal_keep_segments) should be changeable without a server restart.

3. I can configure a standby by copying the postgresql.conf from the
master. I only have to change a single configuration variable (the
primary_conninfo, or maybe a replication_mode setting) in order to start
the server in standby mode. GUCs which apply only to masters are ignored.

4. I can start a new replica off the master by running a single
command-line utility on the standby and giving it connection information
to the master. Using this connection, it should be able to start a
backup snapshot, copy the entire database and any required logs, and
then come up in standby mode. All that should be required for this is
one or two high-port connections to the master. No recovery.conf file is
required, or exists.

5. I can cause the standby to fail over with a single command to the
failover server. If this is a trigger file, then it already has a
default path to the trigger file in postgresql.conf, so that this does
not require reconfiguration and restart of the standby at crisis time.
Ideally, I use a "pg_failover" command or something similar.

6. Should I decide to make the standby the new master, this should also
be possible with a single command and a one-line configuration on the
other standbys. To aid this, we have an easy way to tell which standbys
in a group are most "caught up". If I try to promote the wrong standby
(it's behind or somehow incompatible), it should fail with an
appropriate message.

7. Should I choose to use archive files as well as streaming
replication, the utilities to manage them (such as pg_archivecleanup and
pg_standby) are built and installed with PostgreSQL by default, and do
not require complex settings with escape codes.
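
As a rough illustration of point 6, here is a minimal sketch of how a tool
might decide which standby is most "caught up" and refuse to promote one
that is behind. It assumes each standby reports its received WAL location
as text in the "hi/lo" hexadecimal form that pg_last_xlog_receive_location()
returns in 9.0. The function names, the standby names and the sample
locations are hypothetical, made up for this example; nothing like this
exists in PostgreSQL today.

```python
# Hypothetical sketch: none of these helpers exist in PostgreSQL.

def parse_wal_location(text):
    """Turn a WAL location such as '0/3000020' into a comparable (hi, lo) tuple."""
    hi, lo = text.strip().split("/")
    return int(hi, 16), int(lo, 16)


def most_caught_up(standby_locations):
    """Return the name of the standby that has received the most WAL."""
    return max(
        standby_locations,
        key=lambda name: parse_wal_location(standby_locations[name]),
    )


def check_promotable(candidate, standby_locations):
    """Return candidate, or raise ValueError if another standby is further ahead."""
    best = most_caught_up(standby_locations)
    candidate_loc = parse_wal_location(standby_locations[candidate])
    best_loc = parse_wal_location(standby_locations[best])
    if candidate_loc < best_loc:
        raise ValueError(
            f"{candidate} is behind {best}; refusing to promote"
        )
    return candidate


# Sample data, invented for illustration only.
locations = {
    "standby_a": "0/3000020",
    "standby_b": "0/2FFFFF0",
    "standby_c": "1/10",
}
print(most_caught_up(locations))  # prints standby_c
```

The locations are compared as pairs of integers rather than as strings,
since a plain string comparison would get the ordering wrong whenever the
two hex fields differ in length.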

That's my vision of "simple replication". It is also 100% achievable.
We just have to prioritize ease-of-use over having, and requiring the
user to set, 1,000 little knobs.

Speaking of knobs .... (next message)

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
