Combination with priority-based and quorum-based synchronous replications

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Combination with priority-based and quorum-based synchronous replications
Date: 2017-08-24 03:38:10
Message-ID: CAD21AoCxJ5X2mWLm2azr1WtEQLK7CP5XmhhCGJTTYKnAz5pH1w@mail.gmail.com
Lists: pgsql-hackers

Hi all,

PostgreSQL 9.6 introduced priority-based multiple synchronous
replication, and PostgreSQL 10 introduced the quorum-based variant.
I initially wanted to use both methods in combination, but that is
not supported yet. It would be useful in cases where, for example,
we have three replicas: nodeA, nodeB and nodeC. Two of them (nodeA
and nodeC) are read replicas and the other one (nodeB) is for backup.
In this case we would want to replicate the data synchronously to
nodeB while replicating to nodeA and nodeC using quorum-based
synchronous replication. To cover such a use case we need a feature
that allows both methods to be used in combination. IIUC other use
cases were also mentioned in earlier discussions.

There are two ideas for implementing this.
1. Use the two synchronous replication methods in combination. For the
above example, we can set
s_s_names = 'First 2 (nodeB, Any 1 (nodeA, nodeC))'.
This approach allows a nested set of nodes in s_s_names. We could
consider supporting deeper nesting, but one level of nesting would be
enough for most cases. It would also be useful if one synchronization
method could contain multiple other synchronization methods. For
example, I can imagine a use case where each of two backup data
centers has two replicas and we want to send the data synchronously
to either replica in each data center. We can set
s_s_names = 'First 2 (Any 1 (nodeA, nodeB), Any 1 (nodeC, nodeD))'.
It might be over-engineering, but it is worth considering.
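
As a rough illustration only (not part of any patch), the nested
semantics of idea 1 could be modelled as below. The names Group,
is_available and is_satisfied are hypothetical. The rule that FIRST
waits for the first num_sync *available* members in list order is an
assumption about how priority would extend to nested groups; the
proposal itself does not spell this out.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Group:
    """One synchronization method applied to a list of members (hypothetical model)."""
    method: str                 # "FIRST" (priority-based) or "ANY" (quorum-based)
    num_sync: int               # how many members must confirm
    members: List["Member"]     # standby names or nested groups


Member = Union[str, Group]


def is_available(m: Member, connected: Set[str]) -> bool:
    """A standby is available if connected; a group if enough members are available."""
    if isinstance(m, str):
        return m in connected
    return sum(is_available(c, connected) for c in m.members) >= m.num_sync


def is_satisfied(m: Member, connected: Set[str], acked: Set[str]) -> bool:
    """Return True if the commit may be released under this (assumed) semantics."""
    if isinstance(m, str):
        return m in connected and m in acked
    if m.method == "ANY":
        # Quorum: any num_sync members must be satisfied.
        done = sum(is_satisfied(c, connected, acked) for c in m.members)
        return done >= m.num_sync
    if m.method == "FIRST":
        # Priority: the first num_sync available members must all be satisfied.
        chosen = [c for c in m.members if is_available(c, connected)][: m.num_sync]
        return len(chosen) == m.num_sync and all(
            is_satisfied(c, connected, acked) for c in chosen
        )
    raise ValueError(f"unknown method: {m.method}")


# 'First 2 (nodeB, Any 1 (nodeA, nodeC))'
backup_plus_quorum = Group("FIRST", 2, ["nodeB", Group("ANY", 1, ["nodeA", "nodeC"])])

# 'First 2 (Any 1 (nodeA, nodeB), Any 1 (nodeC, nodeD))'
two_data_centers = Group(
    "FIRST", 2,
    [Group("ANY", 1, ["nodeA", "nodeB"]), Group("ANY", 1, ["nodeC", "nodeD"])],
)
```

With all standbys connected, the first setting is satisfied once nodeB
and one of nodeA/nodeC have confirmed, and the second once one replica
in each data center has confirmed.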

2. Extend quorum-based synchronous replication so that we can specify
the nodes to which we want to replicate the data synchronously. For
the above example, if we set
s_s_names = 'Any 2 (nodeA, *nodeB, nodeC)'
then the master server waits for nodeB and either nodeA or nodeC. That
is, the special mark '*' means that the marked node is preferentially
selected as a synchronous server. This approach is simpler than
approach #1, but we cannot nest more than one level and a
synchronization method cannot contain another method.
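
As a rough illustration only, the '*' mark of idea 2 could be modelled
as below. The function name quorum_with_marked is hypothetical. The
sketch assumes a marked node is mandatory, as "wait for nodeB and
either nodeA or nodeC" suggests; the proposal does not say what should
happen when a marked node is disconnected, so that case is not handled.

```python
from typing import Iterable, List


def quorum_with_marked(num_sync: int, members: List[str], acked: Iterable[str]) -> bool:
    """Check an 'Any num_sync (...)' list in which '*name' marks a required standby.

    Assumed semantics: every marked standby must have confirmed, and the
    total number of confirmations from listed standbys must reach num_sync.
    """
    required = {m[1:] for m in members if m.startswith("*")}
    names = {m.lstrip("*") for m in members}
    confirmed = names & set(acked)
    return required <= confirmed and len(confirmed) >= num_sync


# 'Any 2 (nodeA, *nodeB, nodeC)'
members = ["nodeA", "*nodeB", "nodeC"]
released = quorum_with_marked(2, members, {"nodeB", "nodeC"})
blocked = quorum_with_marked(2, members, {"nodeA", "nodeC"})
```

Here the first call releases the commit (nodeB plus one other has
confirmed), while the second does not because nodeB is missing.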

Feedback and comments are very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
