Re: Read Replica termination occurs when its max_active_replication_origins setting is lower than the primary

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Nazneen Jafri <jafrinazneen(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Read Replica termination occurs when its max_active_replication_origins setting is lower than the primary
Date: 2025-09-17 04:05:33
Message-ID: CAD21AoB3kcrVnTxHX+F=MOt_DWHdKFBLKaBBeOcsUyW6jWEMfQ@mail.gmail.com
Lists: pgsql-bugs

On Tue, Sep 16, 2025 at 7:45 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Tue, Sep 16, 2025 at 04:42:39PM -0700, Masahiko Sawada wrote:
> > On Tue, Sep 16, 2025 at 3:52 PM Nazneen Jafri <jafrinazneen(at)gmail(dot)com> wrote:
> >> The parameter max_active_replication_origins should be added to the list
> >> of mandatory settings that must match between primary and replica during
> >> creation
> >>
> >> [...]
> >
> > Thank you for the report!
> >
> > As reported, the standby could not continue the recovery (especially
> > replaying XLOG_REPLORIGIN_ records) if its
> > max_active_replication_origins is less than the primary's setting. One
> > idea to fix this issue is to require for standbys to have at least the
> > same max_active_replication_origins value as the primary as we do for
> > other GUC parameters such as max_worker_processes and max_wal_senders.
> > It needs to add max_active_replication_origins to the control file and
> > bumps the PG_CONTROL_VERSION. Given that we've released 18RC1 and
> > probably are close to 18 release, I'd like to hear opinions whether
> > such a fix is acceptable or not.
>
> I haven't tried reproducing it on older versions (with
> max_replication_slots instead of max_active_replication_origins), but after
> looking at the code for a bit, I'm growing skeptical that this is new to
> v18.

Right, this is actually not new behavior in v18, as we can reproduce it
with max_replication_slots on older versions. I guess the reason we
didn't require standbys to set max_replication_slots no lower than the
primary's value is that, in principle, the maximum number of
replication slots is not related to recovery work.
max_replication_slots just happened to be reused as the maximum number
of active replication origins for the sake of simplicity. Now that we
have separated the maximum number of active replication origins from
max_replication_slots, it seems to me that
max_active_replication_origins is clearly related to recovery.
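
To make the failure mode concrete, here is a minimal sketch of why a
smaller table on the standby breaks replay. It is not the actual
PostgreSQL code: OriginSlot and replay_origin_advance are hypothetical
names, and the table is simply an array whose length stands in for the
standby's max_active_replication_origins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one in-memory replication origin state. */
typedef struct OriginSlot
{
    int  origin_id;     /* 0 means the slot is free */
    long remote_lsn;    /* last progress recorded for this origin */
} OriginSlot;

/*
 * Record progress for origin_id in a table of nslots entries.
 * Returns false when the origin is not tracked yet and no free slot is
 * left. In this sketch that models the case the standby cannot replay.
 */
static bool
replay_origin_advance(OriginSlot *slots, size_t nslots,
                      int origin_id, long remote_lsn)
{
    OriginSlot *free_slot = NULL;

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].origin_id == origin_id)
        {
            /* Already tracked: just update the progress. */
            slots[i].remote_lsn = remote_lsn;
            return true;
        }
        if (slots[i].origin_id == 0 && free_slot == NULL)
            free_slot = &slots[i];
    }

    if (free_slot == NULL)
        return false;   /* table sized by the standby's setting is full */

    free_slot->origin_id = origin_id;
    free_slot->remote_lsn = remote_lsn;
    return true;
}
```

With a two-entry table, records for two origins replay fine, but the
first record for a third origin has nowhere to go. The primary never
hits this because its own table is larger.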

> In any case, the PANIC provides a clear error message, which is
> roughly the same as what we'd say with the control file approach, right?

Yes. With the control file approach, we would raise a FATAL (or pause
recovery with a WARNING) instead of a PANIC.
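
As a rough sketch of what the control file approach amounts to, the
standby would compare its own setting against the primary's value before
replaying further. The names below (CheckResult, check_required_setting,
can_pause) are hypothetical and only illustrate the comparison. Which of
the two outcomes applies is modeled by a flag here, which is an
assumption rather than the real condition.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the standby-side parameter check. */
typedef enum CheckResult
{
    CHECK_OK,                   /* standby value is sufficient */
    CHECK_PAUSE_WITH_WARNING,   /* pause recovery and emit a WARNING */
    CHECK_FATAL                 /* stop with a FATAL error */
} CheckResult;

/*
 * Compare the standby's setting with the primary's value (which the
 * control file approach would make available on the standby).
 * can_pause is an assumed flag standing in for whatever decides
 * between pausing and failing.
 */
static CheckResult
check_required_setting(int standby_value, int primary_value, bool can_pause)
{
    if (standby_value >= primary_value)
        return CHECK_OK;

    return can_pause ? CHECK_PAUSE_WITH_WARNING : CHECK_FATAL;
}
```

Either way the mismatch is detected by an explicit comparison of the two
settings rather than surfacing later as a PANIC in the middle of
replaying an origin record.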

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
