Re: Configuring synchronous replication

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, jd(at)commandprompt(dot)com, Thom Brown <thom(at)linux(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Configuring synchronous replication
Date: 2010-09-24 08:08:24
Message-ID: 4C9C5C78.3060507@enterprisedb.com
Lists: pgsql-committers pgsql-hackers

On 24/09/10 01:11, Simon Riggs wrote:
>> But that's not what I call synchronous replication, it doesn't give
>> you the guarantees that
>> textbook synchronous replication does.
>
> Which textbook?

I was using that word metaphorically, but for example:

Wikipedia
http://en.wikipedia.org/wiki/Replication_%28computer_science%29
(includes a caveat that many commercial systems skimp on it)

Oracle docs
http://download.oracle.com/docs/cd/B10500_01/server.920/a96567/repoverview.htm
(scroll to "Synchronous Replication")

Googling for "synchronous replication textbook" also turns up this
actual textbook:
Database Management Systems by R. Ramakrishnan & others
which uses synchronous replication with this meaning, although in the
context of multi-master replication.

Interestingly, "Transaction Processing: Concepts and Techniques" by
Gray and Reuter, chapter 12.6.3, defines three levels:

1-safe - what we call asynchronous.
2-safe - commit is acknowledged after the slave acknowledges it, but if
the slave is down, fall back to asynchronous mode.
3-safe - commit is acknowledged only after the slave acknowledges it. If
the slave is down, refuse to commit.
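
As a rough illustration of how the three levels differ, the master's
decision at commit time could be modelled as below. This is only a
sketch: the names Safety and commit_outcome are invented here and come
from neither the book nor PostgreSQL, and "wait" simply means the commit
stays pending until the slave answers.

```python
from enum import Enum


class Safety(Enum):
    ONE_SAFE = 1    # asynchronous
    TWO_SAFE = 2    # wait for the slave, fall back to async if it is down
    THREE_SAFE = 3  # wait for the slave, refuse to commit if it is down


def commit_outcome(level, slave_up, slave_acked):
    """Return 'ack', 'wait' or 'refuse' for a commit on the master.

    Hypothetical model: slave_up says whether the slave is reachable,
    slave_acked whether it has acknowledged this particular commit.
    """
    if level is Safety.ONE_SAFE:
        # Never waits for the slave.
        return "ack"
    if not slave_up:
        # The only point where 2-safe and 3-safe differ.
        return "ack" if level is Safety.TWO_SAFE else "refuse"
    # Slave is up: both levels wait for its acknowledgement.
    return "ack" if slave_acked else "wait"
```

The two stricter levels behave identically while the slave is up; they
only diverge in what happens when it is down.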

In the context of multi-master replication, "eager replication" seems to
be commonly used to mean synchronous replication.

If we just want *something* that's useful, and want to avoid the hassle
of registration and all that, I proposed a while back
(http://archives.postgresql.org/message-id/4C7E29BC.3020902@enterprisedb.com)
that we could aim for behavior that would be useful for distributing
read-only load to slaves.

The use case is specifically that you have one master and one or more
hot standby servers. You also have something like pgpool that
distributes all read-only queries across all the nodes, and routes
updates to the master server.

In this scenario, you want the master node not to acknowledge a
commit to the client until all currently connected standby servers have
replayed the commit. Furthermore, you want a standby server to stop
accepting queries if it loses connection to the master, to avoid giving
out-of-date responses. With suitable timeouts in the master and the
standby, it seems possible to guarantee that you can connect to any node
in the system and get an up-to-date result.
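
The two rules above could be sketched roughly as follows. All names are
hypothetical and this is not PostgreSQL code. The sketch also assumes
that "suitable timeouts" means the master keeps waiting for a silent
standby for longer than that standby keeps answering queries on its own;
that reading is an assumption, not something spelled out above.

```python
def master_can_ack(commit_lsn, replayed_lsn):
    """True once every currently connected standby has replayed the commit.

    replayed_lsn maps each connected standby (hypothetical identifier) to
    the last WAL position it reports having replayed. With no connected
    standbys the commit is acknowledged immediately.
    """
    return all(lsn >= commit_lsn for lsn in replayed_lsn.values())


def standby_accepts_queries(now, last_master_contact, standby_timeout):
    """A standby stops serving queries once it has lost the master."""
    return now - last_master_contact <= standby_timeout


def master_may_drop_standby(now, last_standby_contact, master_timeout):
    """The master stops waiting for a standby only after its own timeout.

    Assumption: master_timeout is chosen larger than standby_timeout, so
    the standby has already stopped serving queries by the time the
    master gives up on it.
    """
    return now - last_standby_contact > master_timeout
```

For example, master_can_ack(100, {"a": 100, "b": 90}) is false because
standby "b" has not yet replayed the commit at position 100.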

It does not give zero data loss like synchronous replication does, but
it keeps hot standby servers trustworthy for queries.

It bothers me that no-one seems to have a clear use case in mind. People
want "synchronous replication", but don't seem to care much what
guarantees it should provide. I wish the terminology were better
standardized in this area.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
