Synchronous Standalone Master Redoux

From: Shaun Thomas <sthomas(at)optionshouse(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Synchronous Standalone Master Redoux
Date: 2012-07-09 20:30:01
Message-ID: 4FFB3F49.4050108@optionshouse.com
Lists: pgsql-hackers

Hey everyone,

Upon doing some usability tests with PostgreSQL 9.1 recently, I ran
across this discussion:

http://archives.postgresql.org/pgsql-hackers/2011-12/msg01224.php

And after reading the entire thing, I found it odd that the overriding
pushback was that nobody could think of a use case. The argument was:
if you don't care if the slave dies, why not just use asynchronous
replication?

I'd like to introduce all of you to DRBD. DRBD is, for those who aren't
familiar, distributed (network) block-level replication. Right now, this
is what we're using, and will use in the future, to ensure a stable
synchronous PostgreSQL copy on our backup node. I was excited to read
about synchronous replication, because with it came the possibility of
having two readable nodes on the servers we already have. You can't do
that with DRBD; secondary nodes can't even mount the device.

So here's your use case:

1. Slave wants to be synchronous with master. Master wants replication
on at least one slave. They have this, and are happy.
2. For whatever reason, slave crashes or becomes unavailable.
3. Master notices no more slaves are available, and operates in
standalone mode, accumulating WAL files until a suitable slave appears.
4. Slave finishes rebooting/rebuilding/upgrading/whatever, and
re-subscribes to the feed.
5. Slave stays in degraded sync (asynchronous) mode until it is caught
up, and then switches to synchronous. This makes both master and slave
happy, because the *intent* of synchronous replication is fulfilled.
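
The five steps above can be read as a small state machine. What follows
is only a toy sketch of that proposed behavior, not PostgreSQL code: the
`Master` class, the `Mode` names, and the integer WAL counters are all
hypothetical and exist purely for illustration.

```python
from enum import Enum


class Mode(Enum):
    SYNC = "sync"              # commits are confirmed by the slave
    STANDALONE = "standalone"  # no slave; master keeps committing and keeps WAL
    CATCHUP = "catchup"        # slave is back but behind; effectively async


class Master:
    """Toy model of the proposed fallback behavior (hypothetical)."""

    def __init__(self):
        self.mode = Mode.SYNC
        self.wal_position = 0    # WAL records written by the master
        self.slave_position = 0  # WAL records the slave has received

    @property
    def lag(self):
        return self.wal_position - self.slave_position

    def commit(self):
        """Commit one transaction; in this model it never blocks."""
        self.wal_position += 1
        if self.mode is Mode.SYNC:
            # Step 1: in sync mode the slave confirms every commit.
            self.slave_position = self.wal_position

    def slave_lost(self):
        # Steps 2-3: fall back to standalone instead of waiting forever.
        self.mode = Mode.STANDALONE

    def slave_reconnected(self):
        # Step 4: the slave re-subscribes, possibly far behind.
        self.mode = Mode.CATCHUP
        self._maybe_resync()

    def slave_receives(self, records):
        # Step 5: the slave works through the accumulated WAL.
        if self.mode is not Mode.CATCHUP:
            return
        self.slave_position = min(self.wal_position,
                                  self.slave_position + records)
        self._maybe_resync()

    def _maybe_resync(self):
        # Switch back to synchronous only once the slave is fully caught up.
        if self.mode is Mode.CATCHUP and self.lag == 0:
            self.mode = Mode.SYNC
```

The point of the model is that `commit()` has no code path that waits:
losing the slave changes the mode, not the availability of the master.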

PostgreSQL's implementation means the master will block until
someone/something notices and tells it to stop waiting, or the slave
comes back. For pretty much any high-availability environment, this is
not viable. Based on that alone, I can't imagine a scenario where
synchronous replication would be considered beneficial.

The current setup doubles the number of unplanned outage scenarios, to
the point that I'd never use it in a production environment. Right now, we only
care if the master server dies. With sync rep, we'd have to watch both
servers like a hawk and be ready to tell the master to disable sync rep,
lest our 10k TPS system come to an absolute halt because the slave died.
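
To make that monitoring burden concrete, here is a minimal sketch of the
decision an external watchdog would have to make on every poll. It
assumes the watchdog has already fetched rows resembling the
`pg_stat_replication` view, with `state` and `sync_state` columns; the
function names and the returned action strings are hypothetical, and
actually applying an action (for example, emptying
`synchronous_standby_names` and reloading the configuration) is left out.

```python
def has_sync_standby(rows):
    """True if some standby is streaming and counted as synchronous."""
    return any(r.get("state") == "streaming" and r.get("sync_state") == "sync"
               for r in rows)


def has_caught_up_standby(rows):
    """True if some standby is streaming, i.e. no longer in catchup."""
    return any(r.get("state") == "streaming" for r in rows)


def watchdog_action(rows, sync_rep_enabled):
    """Decide what the watchdog should do for one poll of the master.

    rows: list of dicts shaped like pg_stat_replication rows (assumed).
    sync_rep_enabled: whether the master currently waits for a standby.
    """
    if sync_rep_enabled:
        # Without a sync standby, every commit on the master would block.
        return "none" if has_sync_standby(rows) else "disable_sync_rep"
    # Sync rep is off; turn it back on only once a standby has caught up.
    return "enable_sync_rep" if has_caught_up_standby(rows) else "none"
```

Everything this function decides is what the master itself would handle
under the proposal; here it lives in an external process that can fail
on its own.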

With DRBD, when a slave node goes offline, the master operates in
standalone until the secondary re-appears, after which it
re-synchronizes missing data, and then operates in sync mode afterwards.
Just because the data is temporarily out of sync does *not* mean we want
asynchronous replication. I think you'd be hard pressed to find many
users taking advantage of DRBD's async mode. Just because data is
temporarily catching up doesn't mean it will remain in that state.

I would *love* to have the functionality discussed in the patch. If I
can make a case for it, I might even be able to convince my company to
sponsor its addition, provided someone has time to integrate it. Right
now, we're using DRBD so we can have a very short outage window while
the offline node gets promoted, and it works, but that means a basically
idle server at all times. I'd gladly accept a 10-20% performance hit for
sync rep if it meant that other server could reliably act as a read
slave. That's currently impossible because async replication is too
slow, and sync is too fragile for reasons stated above.

Am I totally off-base here? I was shocked when I actually read the
documentation on how sync rep worked, and saw that no servers would
function properly until at least two were online.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas(at)optionshouse(dot)com

