From: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com> |
---|---|
To: | "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Fast switchover |
Date: | 2025-09-08 16:48:12 |
Message-ID: | CANzqJaAbj3-LOnTc0YaxjYROEN0h_cHqdPLn3hSPGiqkxm6j1Q@mail.gmail.com |
Lists: | pgsql-general |
On Mon, Sep 8, 2025 at 12:37 PM Klaus Darilion <klaus(dot)darilion(at)nic(dot)at>
wrote:
>
>
> *From:* Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
> *Sent:* Monday, September 8, 2025 6:10 PM
> *To:* pgsql-general(at)lists(dot)postgresql(dot)org
> *Subject:* Re: Fast switchover
>
>
>
> On Mon, Sep 8, 2025 at 11:03 AM legrand legrand <
> legrand_legrand(at)hotmail(dot)com> wrote:
>
> Hello all the readers,
>
>
>
> For some projects we need a fast *manual* switchover to address Near Zero
> downtime maintenance
>
> (not speaking here about automated failover like those provided by HA
> tools, but just planned, controlled operations)
>
>
>
>
>
> Database Physical replication switchover itself:
>
> - initial replication (before switchover) should be synchronous or
> replication LAG should be controlled to prevent data loss.
>
> - Switchover duration does not seem "compressible" below a few seconds
> (because of primary shutdown, promotion, new standby catch-up, ...)
>
> - Application retry strategy (after disconnection) should be tuned using
> proper retry delay. Pooler or specific driver may help.
>
>
> There will always be a few seconds delay while the applications reconnect.
>
>
>
> Do the applications connect via a VIP? That's simpler for the application.
>
>
>
> This is what I do from the not-yet-new-primary:
>
> 1. psql -h $CurrentPrimary -c "ALTER SYSTEM SET
> synchronous_standby_names TO '*';"
> 2. Wait a few seconds.
> 3. ssh $CurrentPrimary sudo ip del $VIP # cmd is more complicated, but
> you get the idea
> 4. ssh $CurrentPrimary pg_ctl stop -mfast # to kill connections, has
> to happen, no matter the solution.
>
> If you remove the VIP in step 3, the client-side TCP connections are
> broken (and may hang around) and will not be properly terminated when you
> stop PostgreSQL in step 4. That may delay clients in detecting the broken
> TCP connection and reconnecting to the server (depending on the
> network/firewall configuration on the servers). A faster reconnect may
> be achieved if you first stop PostgreSQL and then remove the VIP.
>
Interesting. Thanks.
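
For reference, the reordered sequence Klaus suggests (stop PostgreSQL first, then remove the VIP) might look roughly like the sketch below. This is only an illustration under assumptions, not a tested procedure: the host name, VIP, interface name, the 5-second wait, and the DRY_RUN helper are all hypothetical placeholders. The pg_reload_conf() call is an addition that is not in the steps above; ALTER SYSTEM only writes postgresql.auto.conf, so a reload is needed before synchronous_standby_names takes effect. pg_ctl is assumed to find the data directory via PGDATA on the remote host. Promotion of the standby is left out, as in the original steps.

```shell
#!/usr/bin/env bash
# Sketch: stop PostgreSQL before removing the VIP, so clients receive a
# proper connection termination instead of a silently dead TCP session.
# All values below are hypothetical placeholders.
set -euo pipefail

CURRENT_PRIMARY="${CURRENT_PRIMARY:-pg1.example.internal}"  # hypothetical host
VIP="${VIP:-192.0.2.10/24}"                                 # hypothetical VIP
VIP_DEV="${VIP_DEV:-eth0}"                                  # hypothetical interface
DRY_RUN="${DRY_RUN:-1}"                                     # 1 = only print commands

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

switchover_steps() {
  # 1. Make replication synchronous, then reload so the setting applies.
  run psql -h "$CURRENT_PRIMARY" -c "ALTER SYSTEM SET synchronous_standby_names TO '*';"
  run psql -h "$CURRENT_PRIMARY" -c "SELECT pg_reload_conf();"
  # 2. Wait a few seconds for the standby to become synchronous.
  run sleep 5
  # 3. Stop PostgreSQL first: a fast shutdown disconnects the clients.
  run ssh "$CURRENT_PRIMARY" pg_ctl stop -m fast
  # 4. Only then remove the VIP from the old primary.
  run ssh "$CURRENT_PRIMARY" sudo ip addr del "$VIP" dev "$VIP_DEV"
}

switchover_steps
```

With the default DRY_RUN=1 the script only prints the commands in order, which makes it easy to review the sequence before running anything against a real host with DRY_RUN=0.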
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!