Re: Re: new high availability feature for the system with both asynchronous and synchronous replication

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Higuchi, Daisuke" <higuchi(dot)daisuke(at)jp(dot)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: new high availability feature for the system with both asynchronous and synchronous replication
Date: 2017-02-28 08:10:42
Message-ID: CAD21AoAPkj0=bYbVhcDJOC+1Ss74q+DhGxT4KyHCiRNgbLcJ7g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 28, 2017 at 1:56 PM, Higuchi, Daisuke
<higuchi(dot)daisuke(at)jp(dot)fujitsu(dot)com> wrote:
> Hi all,
>
> I have created a POC patch for my proposal of a new high availability
> feature. I would like to discuss it, although it will probably be PG11
> material because there has not been enough discussion yet.
>
> This patch makes the walsender for an asynchronous standby wait until the
> walsender for the synchronous standby confirms that WAL has been flushed to
> disk. The feature is activated when the GUC parameter
> "async_walsender_delay" is set to on.
>
> The case where this feature is useful (the same as I wrote before):
> 1. The primary and the synchronous standby are in the same center, called
>    the main center.
> 2. The asynchronous standby is in another center, called the backup center.
>    (The backup center is located far away from the main center. If the
>    replication mode were synchronous, performance would deteriorate, so this
>    replication must be asynchronous.)
> 3. Asynchronous replication is also performed within the backup center.
> 4. When the primary in the main center stops abnormally, the standby in the
>    main center is promoted, and the standby in the backup center connects to
>    the new primary.
>
> [Main center]
> |--------------------------------------------|
> | |----------|   synchronous    |----------| |
> | |          |   replication    |          | |
> | | primary  | <--------------> | standby1 | |
> | |----------|                  |----------| |
> |----||--------------------------------------|
>      ||
>      || asynchronous
>      || replication
>      ||
>      ||  [Backup center]
> |----||--------------------------------------|
> | |----------|   asynchronous   |----------| |
> | |          |   replication    |          | |
> | | standby2 | <--------------> | standby3 | |
> | |----------|                  |----------| |
> |--------------------------------------------|
>
> When the load in the main center becomes high, WAL may reach the standby in
> the backup center before it reaches the synchronous standby in the main
> center, for various reasons. In other words, the standby in the backup
> center may advance beyond the synchronous standby in the main center.
>
> When the primary stops abnormally and the standby in the main center is
> promoted, the two standbys in the backup center must be recovered with
> pg_rewind. However, it is necessary to stop the new primary for pg_rewind.
> If pg_basebackup is used instead, recovery of the backup center takes some
> time. This is not high availability.
>

If the standby server in the main center is promoted to be the new primary
server, why do we need to stop it in order to run pg_rewind on the
standbys in the backup center? I think you don't need to stop the new
primary server if you use the --source-server option.
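
To illustrate, here is a minimal sketch in Python that only builds and prints
the pg_rewind command line for one backup-center standby. The data directory
path, host name, and connection parameters are hypothetical placeholders, and
the helper function is not part of PostgreSQL. It assumes the target standby
has been shut down cleanly, while the source (the new primary) keeps running.

```python
import shlex


def build_pg_rewind_command(target_pgdata, source_conninfo, dry_run=True):
    """Return the argv list for pg_rewind reading from a running source server."""
    argv = [
        "pg_rewind",
        f"--target-pgdata={target_pgdata}",    # data directory of the stopped standby
        f"--source-server={source_conninfo}",  # libpq connection string to the new primary
        "--progress",
    ]
    if dry_run:
        # --dry-run does everything except actually modifying the target directory
        argv.append("--dry-run")
    return argv


# Hypothetical values, for illustration only.
cmd = build_pg_rewind_command(
    "/path/to/standby2/data",
    "host=standby1 port=5432 user=postgres dbname=postgres",
)
print(shlex.join(cmd))
```

The sketch only prints the command and does not run it. With --source-server,
pg_rewind connects to the new primary over a normal connection, so only the
target standby has to be stopped.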

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
