From: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
---|---|
To: | Hugo DUBOIS <hdubois(at)scaleway(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Unexpected Standby Shutdown on sync_replication_slots change |
Date: | 2025-07-24 15:55:26 |
Message-ID: | CAHGQGwGKT=qODdxfE7zxmCjkYYGyKqS9-yQ8NcxRROeKRYOJeA@mail.gmail.com |
Lists: | pgsql-bugs |
On Thu, Jul 24, 2025 at 10:54 PM Hugo DUBOIS <hdubois(at)scaleway(dot)com> wrote:
>
> Hello,
>
> I'm not sure if it's a bug but I've encountered an unexpected behavior when dynamically changing the sync_replication_slots parameter on a PostgreSQL 17 standby server. Instead of logging an error and continuing to run, the standby instance shuts down with a FATAL error, which is not the anticipated behavior for a dynamic parameter change, especially when the documentation doesn't indicate such an outcome.
>
> Steps to Reproduce
>
> Set up a physical replication between two PostgreSQL 17.5 instances.
>
> Ensure wal_level on the primary (and consequently on the standby) is set to replica.
>
> Start both the primary and standby instances, confirming replication is active.
>
> On the standby instance, dynamically change the sync_replication_slots parameter (I have run the following query: ALTER SYSTEM SET sync_replication_slots = 'on'; followed by SELECT pg_reload_conf();)
>
> Expected Behavior
>
> I expected the standby instance to continue running and log an error message (similar to how hot_standby_feedback behaves when not enabled, e.g., a loop of LOG: replication slot synchronization requires "hot_standby_feedback" to be enabled). A FATAL error leading to an unexpected shutdown for a dynamic parameter change on a running standby is not the anticipated behavior. The documentation for sync_replication_slots also doesn't indicate that a misconfiguration or incompatible wal_level would lead to a shutdown.
>
> Actual Behavior
>
> Upon attempting to set sync_replication_slots to on on the standby with wal_level set to replica, the standby instance immediately shuts down with the following log messages:
>
> LOG: database system is ready to accept read-only connections
> LOG: started streaming WAL from primary at 0/3000000 on timeline 1
> LOG: received SIGHUP, reloading configuration files
> LOG: parameter "sync_replication_slots" changed to "on"
> FATAL: replication slot synchronization requires "wal_level" >= "logical"
>
> Environment
>
> PostgreSQL Version: 17.5

Thanks for the report!

I was able to reproduce the issue even on the latest master (v19dev).
I agree that the current behavior, where changing a GUC parameter can
cause the server to shut down, is unexpected and should be avoided.

From what I've seen in the code, the problem stems from postmaster
calling ValidateSlotSyncParams() before starting the slot sync worker.
That function raises an ERROR if wal_level is lower than logical while
sync_replication_slots is enabled. Since ERROR is treated as FATAL
in postmaster, it causes the server to exit.

To fix this, we could modify ValidateSlotSyncParams() so it doesn't
raise an ERROR in this case, as follows.

 ValidateSlotSyncParams(int elevel)
 {
     /*
      * Logical slot sync/creation requires wal_level >= logical.
-     *
-     * Since altering the wal_level requires a server restart, so error out in
-     * this case regardless of elevel provided by caller.
      */
     if (wal_level < WAL_LEVEL_LOGICAL)
-        ereport(ERROR,
+    {
+        ereport(elevel,
                 errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("replication slot synchronization requires \"wal_level\" >= \"logical\""));
+        return false;
+    }
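
To illustrate the difference, here is a small standalone toy model. It is not PostgreSQL code: every toy_* name, the severity values, and the "would exit" flag are assumptions made up for this sketch. It only shows why honoring the caller's elevel and returning false lets the caller keep running, whereas a hard-coded ERROR does not.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy severity levels (hypothetical values, only their ordering matters). */
enum { TOY_LOG = 1, TOY_ERROR = 2 };

/* Hypothetical stand-in for the wal_level setting. */
enum toy_wal_level { TOY_WAL_REPLICA, TOY_WAL_LOGICAL };

/*
 * Set when a report at TOY_ERROR or above is emitted. In this model it
 * stands for "the calling process would exit", mirroring the statement
 * that ERROR is treated as FATAL in postmaster.
 */
bool toy_process_would_exit = false;

/* Toy replacement for ereport(): print the message and record fatal reports. */
void toy_report(int elevel, const char *msg)
{
    fprintf(stderr, "%s: %s\n", elevel >= TOY_ERROR ? "FATAL" : "LOG", msg);
    if (elevel >= TOY_ERROR)
        toy_process_would_exit = true;
}

/* Models the current behavior: the caller's elevel is ignored. */
bool toy_validate_old(enum toy_wal_level wal_level, int elevel)
{
    (void) elevel;              /* unused, as in the unpatched check */
    if (wal_level < TOY_WAL_LOGICAL)
    {
        toy_report(TOY_ERROR, "slot sync requires wal_level >= logical");
        return false;
    }
    return true;
}

/* Models the proposed behavior: report at the caller's elevel, return false. */
bool toy_validate_new(enum toy_wal_level wal_level, int elevel)
{
    if (wal_level < TOY_WAL_LOGICAL)
    {
        toy_report(elevel, "slot sync requires wal_level >= logical");
        return false;
    }
    return true;
}
```

With wal_level at replica and a caller passing TOY_LOG, toy_validate_old() sets the exit flag, while toy_validate_new() only logs and returns false, so the caller can simply skip starting the worker.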
Regards,
--
Fujii Masao