Re: Automatic failback

From: Asad Ali <asadalinagri(at)gmail(dot)com>
To: Wasim Devale <wasimd60(at)gmail(dot)com>
Cc: Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>, pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Automatic failback
Date: 2024-09-19 06:00:26
Message-ID: CAJ9xe=sr1j59XpXpwHpU4Ez12L=WHOwG=Vj1YH9WyBPEy8qWzQ@mail.gmail.com
Lists: pgsql-admin

Hi Wasim,

To achieve automatic failback with minimal or zero downtime during disaster
recovery (DR) using *Barman* and PostgreSQL in your Azure setup, here’s a
high-level architecture and strategy you can follow:

1. Set up Barman in the Azure West region to back up the PostgreSQL
database from the Azure East region. Use streaming replication to keep the
DR database up to date with the primary database.

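- *Primary Database:* Configure streaming replication to the standby in
the West region, and enable continuous WAL archiving to Barman
(archive_mode = on, archive_command = 'barman-wal-archive <barman_host>
<server_name> %p', where the two names in angle brackets are placeholders
for your Barman host and the server name defined in Barman).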

- *Standby Database:* Configure this as a hot standby (read-only), ready
to be promoted in case of failover. Configure it to receive WAL data via
streaming replication.
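
As a rough illustration of the primary-side settings above, here is a
minimal Python sketch that builds the relevant postgresql.conf lines. The
host name barman-west and the server name pg-east are placeholders I made
up, not values from your setup, and the argument order for
barman-wal-archive (Barman host, server name, %p) is the usual form, so
please check it against your Barman version.

```python
def primary_archive_settings(barman_host: str, server_name: str) -> dict:
    """Return postgresql.conf settings for WAL archiving to Barman.

    barman_host and server_name are hypothetical placeholders supplied
    by the caller; they are not taken from any real environment.
    """
    return {
        "wal_level": "replica",
        "archive_mode": "on",
        # PostgreSQL expands %p to the path of the WAL file to archive.
        "archive_command": f"barman-wal-archive {barman_host} {server_name} %p",
    }


def render_conf(settings: dict) -> str:
    """Render the settings as postgresql.conf lines with quoted values."""
    return "\n".join(f"{key} = '{value}'" for key, value in settings.items())


if __name__ == "__main__":
    print(render_conf(primary_archive_settings("barman-west", "pg-east")))
```

Changing archive_mode needs a server restart, so it is worth setting it
together with the other parameters in one maintenance window.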

2. Implement an automatic failover mechanism using a tool like *Patroni* or
*pg_auto_failover*. These tools monitor the primary database and, in case
of failure, automatically promote the standby database to the primary role.

- *Patroni*: A cluster manager for PostgreSQL with high availability,
automatically promoting a standby to primary when a failure is detected.
- *pg_auto_failover*: Another option that provides automatic failover
between primary and standby PostgreSQL databases, making sure the standby
can seamlessly take over.
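
To give a feel for the decision these tools automate, here is a
deliberately simplified Python sketch of "promote only after several
consecutive failed health checks". It is not how Patroni or
pg_auto_failover are implemented (Patroni relies on a distributed
configuration store for leader election, and pg_auto_failover uses a
monitor node); the function name and the threshold of 3 are assumptions
for illustration only.

```python
from typing import Iterable


def should_promote(checks: Iterable[bool], max_failures: int = 3) -> bool:
    """Return True once max_failures consecutive health checks have failed.

    Each item in checks is True if the primary answered that probe and
    False otherwise. A single successful probe resets the counter, so a
    brief network blip does not trigger a promotion.
    """
    consecutive = 0
    for healthy in checks:
        consecutive = 0 if healthy else consecutive + 1
        if consecutive >= max_failures:
            return True
    return False


if __name__ == "__main__":
    # Primary answers once, then fails three probes in a row.
    print(should_promote([True, False, False, False]))
```

Requiring consecutive failures is the important part: across regions, a
short network interruption is far more likely than a real outage, and an
unnecessary promotion leaves you with a failback to do.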

3. After recovery, once the primary database in the east region becomes
available again, you need to set up *automatic failback*. Here’s how you
can handle failback:

- *Step 1: Re-establish Streaming Replication*: After promoting the DR
database in the west region, reconfigure the primary in the east region as
a standby. This can be done by setting up streaming replication from the
promoted DR database (west) back to the original primary (east).
  - Reconfigure the old primary to become a replica of the new primary
  (which is the DR site in the west).
  - Barman can assist with this by restoring the latest backup and
  setting up WAL streaming to the original region.
- *Step 2: Reverse the Failover (Failback)*: Once the original region is
stable, you can reverse the failover with minimal downtime:
  - Stop write operations on the current primary (west) and wait until
  the standby in the east has replayed all remaining WAL.
  - Perform a controlled failover back to the original primary in the
  east, making it the new primary.
  - Reconfigure the DR site in the west region to again become a
  standby replica.

This can be automated using *Patroni* or *pg_auto_failover*, ensuring
seamless transitions between primary and standby without user intervention.
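
The key check before a controlled failback is that the east server has
caught up with the west one. A minimal Python sketch of that check
follows, assuming you have already read the two LSN values yourself (for
example from pg_current_wal_lsn() on the primary and
pg_last_wal_replay_lsn() on the standby); the helper names are my own and
not part of any tool.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a 64-bit integer.

    The text form is two hexadecimal numbers: the high 32 bits, a slash,
    and the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL the standby still has to replay."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replay_lsn)


def safe_to_switch_over(primary_lsn: str, replay_lsn: str,
                        writes_stopped: bool) -> bool:
    """Allow the switchover only if writes are stopped and lag is zero."""
    return writes_stopped and replication_lag_bytes(primary_lsn, replay_lsn) == 0


if __name__ == "__main__":
    print(replication_lag_bytes("1/0", "0/FFFFFFFF"))
    print(safe_to_switch_over("16/B374D848", "16/B374D848", True))
```

Checking the lag only after writes have stopped matters: otherwise the
primary's LSN keeps moving and a lag of zero can be stale a moment later.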

4. To further minimize downtime during failback, you can use *logical
replication*:

- After failover, set up logical replication from the new primary (west)
to the original primary (east). The east server has to be running as a
normal read-write instance for this, because a read-only physical standby
cannot act as a logical replication subscriber.
- Once logical replication has caught up, you can switch applications
back to the original primary (east) with only a brief pause in writes.
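
For reference, the two statements involved in the logical replication
setup can be sketched like this in Python. The publication and
subscription names and the connection string are placeholders I invented,
the names are assumed to be plain identifiers that need no quoting, and
the subscriber side must accept writes for CREATE SUBSCRIPTION to work.

```python
def publication_sql(pub_name: str) -> str:
    """SQL to run on the publisher (the new primary in the west)."""
    return f"CREATE PUBLICATION {pub_name} FOR ALL TABLES;"


def subscription_sql(sub_name: str, pub_name: str, conninfo: str) -> str:
    """SQL to run on the subscriber (the original server in the east).

    conninfo is a libpq connection string pointing at the publisher;
    single quotes inside it are doubled so the SQL literal stays valid.
    """
    escaped = conninfo.replace("'", "''")
    return (
        f"CREATE SUBSCRIPTION {sub_name} "
        f"CONNECTION '{escaped}' PUBLICATION {pub_name};"
    )


if __name__ == "__main__":
    # All names and the host below are hypothetical.
    print(publication_sql("failback_pub"))
    print(subscription_sql("failback_sub", "failback_pub",
                           "host=pg-west dbname=appdb user=replicator"))
```

Keep in mind that logical replication does not copy schema changes, so
the table definitions need to exist on the subscriber before the
subscription is created.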

With this in place, your database stays available through a regional
failure, and downtime during failover and failback is limited to the short
switchover window.

Let me know if you have any other questions.

Best regards,
Asad Ali

On Wed, Sep 18, 2024 at 5:17 PM Wasim Devale <wasimd60(at)gmail(dot)com> wrote:

> Hi All
>
> I have barman tool in place and can any one suggest automatic failback
> with zero down time.
>
> My PG database is hosted on Linux Red Hat 9. Our all Azure resources are
> on east region. We are planning to do DR disaster recovery in west region.
>
> Thanks,
> Wasim
>
