From: | Fabrice Chapuis <fabrice636861(at)gmail(dot)com> |
---|---|
To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: synchronized_standby_slots used in logical replication |
Date: | 2025-06-05 09:40:31 |
Message-ID: | CAA5-nLCKwP3qHUH7z0=bh4Uzwe5Km9T_v5M9mAefp29hMPzTqg@mail.gmail.com |
Lists: | pgsql-hackers |
Thank you very much for the detailed response. I will proceed with the
native implementation for synchronizing logical replication slots. In a
maintenance context, when a standby is shut down, it's possible to
temporarily disable the synchronized_standby_slots parameter to avoid
blocking logical replication on the primary.
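
A minimal sketch of that maintenance step, assuming superuser access on the primary and a physical slot named 'standby1_slot' (a hypothetical name, not taken from this thread). In a Patroni-managed cluster the setting may need to be changed through Patroni's own configuration instead of ALTER SYSTEM:

```sql
-- Before stopping the standby: stop waiting on its physical slot.
-- The parameter can be changed with a configuration reload.
ALTER SYSTEM SET synchronized_standby_slots = '';
SELECT pg_reload_conf();

-- ... standby maintenance happens here ...

-- Once the standby is streaming again: restore the setting.
-- 'standby1_slot' is a hypothetical slot name used for illustration.
ALTER SYSTEM SET synchronized_standby_slots = 'standby1_slot';
SELECT pg_reload_conf();

-- Confirm the value now in effect.
SHOW synchronized_standby_slots;
```

While the parameter is empty, logical subscribers may get ahead of the standby, so a failover during that window carries the risk described in the quoted reply below.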
Regards
Fabrice
On Thu, Jun 5, 2025 at 8:57 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> On Wed, Jun 4, 2025 at 4:01 PM Fabrice Chapuis <fabrice636861(at)gmail(dot)com>
> wrote:
> >
> > Hi,
> >
> > I'm working with logical replication in a PostgreSQL 17 setup, and I'm
> > exploring the new synchronized_standby_slots parameter to make replication
> > slots failover safe in a highly available environment using physical
> > standby nodes managed by Patroni.
> >
> > While testing this feature, I encountered a blocking behavior: when a
> > standby is listed in synchronized_standby_slots and that standby goes
> > offline, logical replication on the primary stops progressing. From what I
> > understand, the primary node waits for the standby to acknowledge received
> > WAL records, effectively stalling WAL decoding for the logical slot. I
> > noticed that the failover slot on the standby continues to be synced.
>
> Yes, your understanding is correct.
>
> >
> > This raises several questions about the tradeoffs and implications of
> > using this feature:
> >
> > What are the risks or limitations if synchronized_standby_slots is left
> > empty (the default)? Is there a risk of data loss or inconsistency for
> > logical subscribers in such cases?
>
> If the 'synchronized_standby_slots' setting is left unset, logical
> replication subscribers may progress ahead of the physical standby
> servers. In the event of a failover under such conditions, the new
> primary might lack the necessary data to continue supporting logical
> replication, even if synchronized slots are in place, resulting in
> unexpected behavior. Therefore, it is strongly recommended to
> configure 'synchronized_standby_slots' properly to ensure that all
> configured physical standbys have received and flushed the changes
> before those changes are made visible to logical replication
> subscribers.
>
>
> > Is it expected behavior that any failure of a standby listed in
> > synchronized_standby_slots stalls logical decoding on the primary? If so,
> > are there any ways to avoid blocking WAL decoding while still having slot
> > synchronization?
>
> Yes, this is expected behavior. It is similar to how
> 'synchronous_standby_names' works, where a commit on the primary is
> allowed to proceed only after the configured standby servers
> acknowledge receipt of the data. The main difference is that
> 'synchronous_standby_names' provides more configuration options, such
> as FIRST and ANY, allowing the system to wait for a subset of standbys
> rather than all of them. However, if none of the configured standbys
> are available, the primary will still wait, just like in this case,
> until a standby becomes available or the configuration is changed. In
> the future, if needed, similar flexibility (e.g., support for ANY,
> FIRST) could potentially be extended to 'synchronized_standby_slots'
> as well. For now, the way to move forward is either by updating the
> configuration or by restoring the standby to an operational state.
>
>
> > Is Patroni managing failover slots better than the native Postgres implementation?
>
> I'm not entirely certain about that. However, PostgreSQL does handle
> several complex scenarios, such as:
> --Ensuring seamless logical replication on failover by allowing users
> to configure potential failover candidates via
> synchronized_standby_slots, making synced slots ready for failover in
> all the situations.
> --To ensure consistency, we avoid direct copy of slot unless a
> consistent point could be reached with the new values. Otherwise after
> promotion, the slots may not reach a consistent point, potentially
> resulting in data loss.
> --Supporting two-phase transactions for failover slots, where
> transactions prepared before two_phase decoding is enabled are handled
> correctly even if the failover occurs immediately afterward.
>
> You may want to check with the Patroni community for more detailed
> insights. We're open to considering any gaps or missing functionality
> in PostgreSQL as well.
>
> thanks
> Shveta
>