Re: Synchronization levels in SR

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Synchronization levels in SR
Date: 2010-05-27 07:13:45
Message-ID: AANLkTinZdITDHp-F_xLTYA-ZL8XV_hRcKt5IVICNTSn4@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 26, 2010 at 10:37 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> If the remote server responded first, then that proves it is a better
> candidate for failover than the one you think of as near. If the two
> standbys vary over time then you have network problems that will
> directly affect the performance on the master; synch_rep = N would
> respond better to any such problems.

No. The remote standby might temporarily respond first even though it is
usually behind the near one. Read-only queries or an incrementally updated
backup operation might cause a burst of disk writes and delay the ACK from
the standby. Lock contention between read-only queries and recovery
would also delay the ACK. So the standby which responds first is not always
the best candidate for failover. Also, the administrator generally doesn't
put the remote standby under the control of clusterware such as Heartbeat.
In that case, the remote standby will never be a candidate for failover.
But quorum commit cannot cover this simple case.
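
To illustrate the point, here is a minimal sketch of which standby would
satisfy a quorum of one for each commit. The latencies are made-up numbers
and first_responder() is a hypothetical helper, not anything in the
proposed patches.

```python
# Hypothetical sketch: with a quorum of 1, the commit is released by
# whichever standby ACKs first, regardless of which one is "near".
def first_responder(ack_ms):
    """Return the name of the standby with the smallest ACK latency."""
    return min(ack_ms, key=ack_ms.get)

# ACK latencies in milliseconds for three consecutive commits (made up).
commits = [
    {"near": 2, "remote": 40},
    {"near": 95, "remote": 41},  # near standby stalled by a burst of disk writes
    {"near": 3, "remote": 39},
]

winners = [first_responder(c) for c in commits]
print(winners)
```

In the second commit the remote standby answers first only because the
near one is briefly stalled, so a single early ACK says little about which
standby is the better failover target.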

>> OTOH, "synchronous_replication=2" degrades the
>> performance on the master very much.
>
> Yes, but only because you have only one near standby. It would clearly
> to be foolish to make this setting without 2+ near standbys. We would
> then have 4 or more servers; how do we specify everything for that
> config??

If you always want to use the near standby as the candidate for failover
by using quorum commit in the above simple case, you would need to choose
such a foolish setting. Otherwise, unfortunately, you might have to fail
over to the remote standby, which is not under the control of clusterware.

>> "synchronous_replication" approach
>> doesn't seem to cover the typical use case.
>
> You described the failure modes for the quorum proposal, but avoided
> describing the failure modes for the "per-standby" proposal.
>
> Please explain what will happen when the near server is unavailable,
> with per-standby settings. Please also explain what will happen if we
> choose to have 4 or 5 servers to maintain performance in case of the
> near server going down. How will we specify the failure modes?

I'll try to explain that.

(1) most standard case: 1 master + 1 "sync" standby (near)
When the master goes down, something like a clusterware detects that
failure, and brings the standby online. Since we can ensure that the
standby has all the committed transactions, failover doesn't cause
any data loss.

When the standby goes down or a network outage happens, walsender
detects that failure via the replication timeout, keepalive or an error
return from the system calls. Then walsender does something according
to the specified reaction (GUC) to the failure of the standby, e.g.,
walsender wakes up the transactions waiting for the ACK at commit, and
exits. Then the master runs standalone.
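
The reaction described above could be sketched roughly as follows. This is
only a toy model under my own assumptions: the Master class, its fields and
the replication_timeout value are hypothetical names for illustration, and
the only reaction modelled is "release the waiters and run standalone".

```python
from dataclasses import dataclass, field

@dataclass
class Master:
    replication_timeout: float            # seconds; hypothetical setting
    sync_standby_connected: bool = True
    waiting_commits: list = field(default_factory=list)
    released: list = field(default_factory=list)

    def commit(self, xid):
        if self.sync_standby_connected:
            # Commit waits for the ACK from the "sync" standby.
            self.waiting_commits.append(xid)
        else:
            # Standalone: nothing to wait for.
            self.released.append(xid)

    def check_standby(self, seconds_since_last_reply):
        # Failure detection via timeout; on failure, wake up every
        # waiting commit and stop treating the standby as connected.
        if (self.sync_standby_connected
                and seconds_since_last_reply > self.replication_timeout):
            self.sync_standby_connected = False
            self.released.extend(self.waiting_commits)
            self.waiting_commits.clear()

m = Master(replication_timeout=30.0)
m.commit(1)
m.commit(2)
m.check_standby(10.0)   # standby still considered alive, commits keep waiting
m.check_standby(31.0)   # timeout exceeded, waiters released
m.commit(3)             # master now runs standalone
print(m.released, m.waiting_commits)
```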

(2) 1 master + 1 "sync" standby (near) + 1 "async" standby (remote)
When the master goes down, something like a clusterware brings the
"sync" standby in the near location online. The administrator would
need to take a fresh base backup of the new master, load it on the
remote standby, change the primary_conninfo, and restart the remote
standby.

When one of the standbys goes down, walsender does the same thing described
in (1). Until the failed standby has restarted, the master runs together
with the other standby.

In (1) and (2), after some failure happens, there would be only one server
which is guaranteed to have all the committed transactions. When it also
goes down, the database service stops. If you want to avoid this fragile
situation, you would need to add one more "sync" standby in the near site.

(3) 1 master + 2 "sync" standbys (near) + 1 "async" standby (remote)
When the master goes down, something like a clusterware brings
one of the "sync" standbys online by using some selection algorithm.
The administrator would need to take a fresh base backup of the new
master, load it on both remaining standbys, change the primary_conninfo,
and restart them.

When one of the standbys goes down, walsender does the same thing described
in (1). Until the failed standby has restarted, the master runs together
with the remaining two standbys. At least one standby is guaranteed to be
in sync with the master.
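
As a rough way to compare the three cases, the sketch below counts which
running servers are still guaranteed to hold every committed transaction
after a failure. The server names and the guaranteed_complete() helper are
hypothetical and only model the "sync"/"async" labels used above.

```python
def guaranteed_complete(cluster, failed):
    """Running servers guaranteed to hold every committed transaction.

    Only the master and "sync" standbys carry that guarantee; an
    "async" standby may be missing recent commits.
    """
    return sorted(
        name for name, role in cluster.items()
        if role in ("master", "sync") and name not in failed
    )

case1 = {"master": "master", "near1": "sync"}
case2 = {"master": "master", "near1": "sync", "remote": "async"}
case3 = {"master": "master", "near1": "sync", "near2": "sync",
         "remote": "async"}

for label, cluster in (("(1)", case1), ("(2)", case2), ("(3)", case3)):
    print(label, guaranteed_complete(cluster, failed={"master"}))
```

After the master fails, cases (1) and (2) are left with a single such
server, while case (3) still has two.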

Is this explanation enough?

>> Also, when "synchronous_replication=1" and one of synchronous standbys
>> goes down, how should the surviving standby catch up with the master?
>> Such standby might be too far behind the master. The transaction commit
>> should wait for the ACK from the lagging standby immediately even if
>> there might be large gap? If yes, "synch_rep_timeout" would screw up
>> the replication easily.
>
> That depends upon whether we send the ACK at point #2, #3 or #4. It
> would only cause a problem if you waited until #4.

Yeah, the problem happens. If we implement quorum commit, we need to
design how the surviving standby catches up with the master.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
