RE: issue with synchronized_standby_slots

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
Cc: Fabrice Chapuis <fabrice636861(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: issue with synchronized_standby_slots
Date: 2025-09-09 05:38:54
Message-ID: TY4PR01MB16907911406A0DED47818733C940FA@TY4PR01MB16907.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Tuesday, September 9, 2025 1:30 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Sep 8, 2025 at 2:56 PM Alexander Kukushkin
> <cyberdemn(at)gmail(dot)com> wrote:
> >
> > Recently we also hit this problem.
> >
> > I think in a current state check_synchronized_standby_slots() and
> > validate_sync_standby_slots() functions are not very useful:
> > - When the hook is executed from postmaster it only checks that
> >   synchronized_standby_slots contains a valid list, but doesn't check that
> >   replication slots exists, because MyProc is NULL. It happens both, on
> >   start and on reload.
> > - When executed from other backends set_config_with_handle() is called
> >   with elevel = 0, and therefore elevel becomes DEBUG3, which results in no
> >   useful error/warning messages.
> >
> > There are a couple of places where check_synchronized_standby_slots()
> > failure is not ignored:
> > 1. alter system set synchronized_standby_slots='invalid value';
> > 2. Parallel workers, because RestoreGUCState() calls
> >    set_config_option_ext()->set_config_with_handle() with elevel=ERROR. As a
> >    result, parallel workers fail to start with the error.
> >
> > With parallel workers it is actually even worse - we get the error even in
> > case of standby:
> > 1. start standby with synchronized_standby_slots referring to
> >    non-existing slots
> > 2. SET parallel_setup_cost, parallel_tuple_cost, and
> >    min_parallel_table_scan_size to 0
> > 3. Run select * from pg_namespace; and observe following error:
> > ERROR: invalid value for parameter "synchronized_standby_slots": "a1,b1"
> > DETAIL: replication slot "a1" does not exist
> > CONTEXT: while setting parameter "synchronized_standby_slots" to "a1,b1"
> > parallel worker
> >
>
> I see the same behaviour for default_table_access_method and
> default_tablespace. For example, see failure cases:
> postgres=# Alter system set default_table_access_method='missing';
> ERROR: invalid value for parameter "default_table_access_method": "missing"
> DETAIL: Table access method "missing" does not exist.
>
> postgres=# SET parallel_setup_cost=0;
> SET
> postgres=# SET parallel_tuple_cost=0;
> SET
> postgres=# Set min_parallel_table_scan_size to 0;
> SET
> postgres=# select * from pg_namespace;
> ERROR: invalid value for parameter "default_table_access_method": "missing"
> DETAIL: Table access method "missing" does not exist.
> CONTEXT: while setting parameter "default_table_access_method" to "missing"
> parallel worker
>
> OTOH, there is no ERROR on reload or restart.
>
> It is fair to argue that invalid GUC values should be ignored in certain cases like
> parallel query but we should have the same solution for other similar
> parameters as well.
>
> As for the synchronized_standby_slots, we can follow the behavior similar to
> check_synchronous_standby_names and just give parsing ERRORs. Any
> non-existent slot related errors can be given when that parameter is later used.

I agree. For synchronized_standby_slots, I think it is acceptable to report only
parsing errors, because slots could be dropped even after validating the slot
existence during GUC loading. Additionally, we would report WARNINGs for
non-existent slots during the wait function anyway (e.g., in
StandbySlotsHaveCaughtup()).
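
To make the "parsing errors only" idea concrete, here is a rough standalone
sketch of what such a check could look like. It is not the actual hook and not
a proposed patch: the names sketch_slot_name_is_valid() and
sketch_check_slot_list() are hypothetical, the 64-byte limit is assumed to
mirror NAMEDATALEN, and the parsing is deliberately simplified (no identifier
quoting or case folding). The point is only that the check looks at the syntax
of the list and never looks up whether a slot exists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: mirrors PostgreSQL's NAMEDATALEN; not taken from the thread. */
#define SKETCH_NAMEDATALEN 64

/*
 * Hypothetical helper: a name is syntactically valid if it is non-empty,
 * shorter than the length limit, and uses only lowercase letters, digits
 * and underscores.
 */
static bool
sketch_slot_name_is_valid(const char *name, size_t len)
{
    if (len == 0 || len >= SKETCH_NAMEDATALEN)
        return false;

    for (size_t i = 0; i < len; i++)
    {
        char c = name[i];

        if (!((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_'))
            return false;
    }
    return true;
}

/*
 * Hypothetical parse-only check for a comma-separated slot list.
 * It reports syntax problems only; it never checks slot existence,
 * so the result does not depend on which process runs it.
 */
static bool
sketch_check_slot_list(const char *value)
{
    const char *p = value;

    if (value == NULL)
        return true;

    /* An empty (or all-blank) setting is treated as "no slots configured". */
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p == '\0')
        return true;

    for (;;)
    {
        const char *start;
        const char *end;

        /* Skip leading blanks of this element. */
        while (*p == ' ' || *p == '\t')
            p++;
        start = p;

        /* Find the end of this element. */
        while (*p != '\0' && *p != ',')
            p++;
        end = p;

        /* Trim trailing blanks of this element. */
        while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;

        if (!sketch_slot_name_is_valid(start, (size_t) (end - start)))
            return false;

        if (*p == '\0')
            return true;
        p++;                    /* step over the comma */
    }
}
```

With a check of this shape, a value such as "a1,b1" is accepted regardless of
whether slots a1 and b1 exist, so restoring GUC state in a parallel worker
would not fail on it; a missing slot would only be reported later, when the
list is actually used.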

Best Regards,
Hou zj
