From: | Fabrice Chapuis <fabrice636861(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: issue with synchronized_standby_slots |
Date: | 2025-09-07 08:15:22 |
Message-ID: | CAA5-nLCSgJTUHQRg=m41uniTYsRkxWjNjry7BdxzBh-1q0kf7g@mail.gmail.com |
Lists: | pgsql-hackers |
Thanks for your reply Zhijie,
I understand that the error "invalid value for parameter" will be displayed
if the GUC synchronized_standby_slots has a bad value or if a configured
standby node is not up and running.
But the problem I noticed is that statements could not execute normally and
an error code was returned to the application.
This happened after an upgrade from PG 14 to PG 17.
I could try to reproduce the issue.
Regards,
Fabrice
On Fri, Sep 5, 2025 at 6:07 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
wrote:
> On Thursday, September 4, 2025 9:27 PM Fabrice Chapuis <fabrice636861(at)gmail(dot)com> wrote:
> > With PG 17.5 and using logical replication failover slots. When trying to
> > change the value of synchronized_standby_slots, node2 was not running then the
> > error invalid value for parameter "synchronized_standby_slots": "node1,node2"
> > was generated. The problem is that statement were affected by this and they
> > can't execute.
> >
> > STATEMENT: select service_period,sp1_0.address_line_1 from tbl1 where sp1_0.vn=$1 order by sp1_0.start_of_period
> > 2025-08-24 13:14:29.417 CEST [848477]: [1-1] user=,db=,client=,application= ERROR: invalid value for parameter "synchronized_standby_slots": "node1,node2"
> > 2025-08-24 13:14:29.417 CEST [848477]: [2-1] user=,db=,client=,application= DETAIL: replication slot "s029054a" does not exist
> > 2025-08-24 13:14:29.417 CEST [848477]: [3-1] user=,db=,client=,application= CONTEXT: while setting parameter "synchronized_standby_slots" to "node1,node2"
> > 2025-08-24 13:14:29.418 CEST [777453]: [48-1] user=,db=,client=,application= LOG: background worker "parallel worker" (PID 848476) exited with exit code 1
> > 2025-08-24 13:14:29.418 CEST [777453]: [49-1] user=,db=,client=,application= LOG: background worker "parallel worker" (PID 848477) exited with exit code 1
> >
> > Is this issue already observed
>
> Thank you for reporting this issue. It seems you've added a nonexistent slot to
> synchronized_standby_slots before the server startup. The server does not verify
> the existence of slots at startup due to the absence of slot shared information,
> allowing the server to start successfully. However, when the parallel apply
> worker starts, it re-verifies the GUC setting, resulting in the ERROR you saw.
>
> I think this scenario is not necessarily a bug, as adding nonexistent slots to GUC is
> disallowed. Such slots can block the logical failover slot's advancement,
> increasing the risk of disk bloat due to WAL or dead rows, which is why we added
> the ERROR. There are precedents for this kind of behavior, like
> default_table_access_method and default_tablespace, which prevent queries if
> invalid values are set before server startup.
>
> To resolve the issue, you can remove the invalid slot from the GUC and add it
> back after creating the physical slot.
>
> I also thought about how to improve user experience for this, but it's not
> feasible to verify slot existence at startup because replication has not been
> restored to shared memory during GUC checks. Another option might be to simply
> remove slot existence/type checks from GUC validation.
>
> Best Regards,
> Hou zj
>
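
For reference, the workaround suggested in the quoted reply (remove the invalid slot from the GUC, then add it back once the physical slot exists) might look roughly like the sketch below, run on the primary. It is only an illustration under assumptions: the slot names node1 and node2 are taken from the GUC value shown in the log, and treating node2 as the missing slot is a guess based on node2 being down at the time; adjust the names to whatever pg_replication_slots actually reports.

```sql
-- Sketch only; slot names are assumptions taken from the GUC value in the log.

-- 1. See which physical slots really exist on the primary.
SELECT slot_name, slot_type, active
FROM pg_replication_slots
WHERE slot_type = 'physical';

-- 2. Drop the missing slot (assumed here to be node2) from the GUC and reload.
ALTER SYSTEM SET synchronized_standby_slots = 'node1';
SELECT pg_reload_conf();

-- 3. Later, once the standby is ready, create its physical slot ...
SELECT pg_create_physical_replication_slot('node2');

-- 4. ... and only then add it back to the GUC.
ALTER SYSTEM SET synchronized_standby_slots = 'node1, node2';
SELECT pg_reload_conf();
```

Checking pg_replication_slots first avoids guessing which name in the list is the one the server rejects.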