From: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date: | 2025-08-12 08:26:14 |
Message-ID: | CANhcyEUoF+eN3pcWGU-y+LyQNYFAvJ3v+Mr9UtuVU2W3SwSYHA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 8 Aug 2025 at 03:30, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Aug 5, 2025 at 11:23 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > Please find a few comments on v6:
> >
> > 1)
> > +/*
> > + * Initialize logical decoding status on shmem at server startup. This
> > + * must be called ONCE during postmaster or standalone-backend startup,
> > + * before initializing replication slots.
> > + */
> > +void
> > +StartupLogicalDecodingStatus(bool last_status)
> >
> > The comment says that it needs to be called 'before initializing
> > replication slots' but instead it is called after initializing
> > replication slots (i.e. after StartupReplicationSlots).
>
> Removed.
>
> > Also, can you please help me understand the need of
> > 'StartupLogicalDecodingStatus' when we are doing
> > 'UpdateLogicalDecodingStatusEndOfRecovery' later in StartupXLOG. Why
> > do we need to set last_status temporarily when the new status can be
> > different which will be set in
> > UpdateLogicalDecodingStatusEndOfRecovery
>
> IIUC we need to initialize the logical decoding status with the status
> we used to use when the server shutdown or when the basebackup was
> taken. This status would be used during recovery and might be changed
> by replaying the XLOG_LOGICAL_DECODING_STATUS_CHANGE record. At the
> end of recovery, we update the status based on the server's wal_level
> and the number of logical replication slots so the new status could be
> different from the status used during the recovery.
>
> >
> >
> > 2)
> > CreatePublication() has this:
> >
> > + errmsg("logical decoding needs to be enabled to publish logical changes"),
> > + errhint("Set \"wal_level\" to \"logical\" or create a logical
> > replication slot with \"replica\" \"wal_level\" before creating
> > subscriptions.")));
> >
> > While rest of the places has this:
> >
> > + errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > logical slot on the primary.")));
> >
> > Shall we make these errhint consistent? Either all mention
> > 'wal_level=replica' condition along with slot-creation part or none.
>
> Fixed.
>
> >
> >
> > 3)
> > xlog_decode():
> >
> > + case XLOG_LOGICAL_DECODING_STATUS_CHANGE:
> > /*
> > * This can occur only on a standby, as a primary would
> > - * not allow to restart after changing wal_level < logical
> > + * not allow to restart after changing wal_level < replica
> > * if there is pre-existing logical slot.
> > */
> > Assert(RecoveryInProgress());
> > ereport(ERROR,
> > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > - errmsg("logical decoding on standby requires \"wal_level\" >=
> > \"logical\" on the primary")));
> > + errmsg("logical decoding must be enabled on the primary")));
> >
> > Is the comment correct?
> >
> > a)
> > XLOG_LOGICAL_DECODING_STATUS_CHANGE can be result of logical-slot drop
> > on primary and not necessarily making wal_level < replica
> >
> > b)
> > I see that even standby does not allow to restart when changing
> > wal_level < replica as against what comment says. Have I understood
> > the intent correctly?
> >
> > standby LOG:
> > FATAL: logical replication slot "failover_slot_st" exists, but
> > "wal_level" < "replica"
> > HINT: Change "wal_level" to be "replica" or higher.
>
> I think I don't get your point. The comment looks correct to me.
>
> On a primary server, a logical slot can only decode WAL records that
> were generated while that slot existed. A
> XLOG_LOGICAL_DECODING_STATUS_CHANGE record with logical_decoding=false
> is generated in two cases: after the last logical slot is dropped, or
> when the server starts with no logical slot and wal_level<='replica'.
> In either case, no logical slots can exist that would be able to
> decode these WAL records. However, on standby servers, it is possible
> to decode these XLOG_LOGICAL_DECODING_STATUS_CHANGE records with
> logical_decoding=false, as standbys can decode WAL records
> independently of the primary.
>
> >
> >
> > 4)
> > start_logical_decoding_status_change():
> > + if (LogicalDecodingCtl->transition_in_progress)
> > + {
> > + LWLockRelease(LogicalDecodingControlLock);
> >
> > read_logical_decoding_status_transition() takes care of checking
> > transition_in_progress, I think we missed to remove above from
> > start_logical_decoding_status_change().
>
> Fixed.
>
> >
> >
> > 5)
> > + /* Return if we don't need to change the status */
> > + if (LogicalDecodingCtl->logical_decoding_enabled == new_status)
> > + {
> >
> > Same with this code-logic in start_logical_decoding_status_change(),
> > we shall remove it.
>
> Fixed.
>
> >
> > 6)
> > + * If we're in recovery and the startup process is still taking
> > + * responsibility to update the status, we cannot change.
> > + */
> > + if (!delay_status_change)
> > + return false;
> > +
> >
> > This comment is confusing as when in recovery, we can not change state
> > otherwise as well even if delay_status_change is false. IIUC, the
> > scenario can arise only during promotion, if so, shall we say:
> >
> > "If we're in recovery and a state transition (e.g., promotion) is in
> > progress, wait for the transition to complete and retry on the new
> > primary. Otherwise, disallow the status change entirely, as a standby
> > cannot modify the logical decoding status."
>
> Fixed.
>
> >
> > 7)
> > The name 'delay_status_change' does not indicate which status or the
> > intent of delay. More name options are: defer_logical_status_change,
> > wait_for_recovery_transition/completion,
> > recovery_transition_in_progress
>
> I think 'delay' is used in other similar examples in PostgreSQL code.
> For instance, we have DELAY_CHKPT_START/COMPLETE/IN_COMMIT that are
> set by transactions to delay the actual checkpoint process until these
> transactions complete certain operations. In our case, the flag is set
> by the startup process in order to delay the actual status change
> process by other processes until the recovery completes. Which is a
> very similar usage so I believe 'delay' is appropriate here.
>
> Regarding the 'status', I guess it's relatively obvious in this
> context that the status indicates the logical decoding status so I'm
> not sure that readers would confuse this name.
>
> >
> > 8)
> > DisableLogicalDecodingIfNecessary():
> > +
> > + /* Write the WAL to disable logical decoding on standbys too */
> > + if (XLogStandbyInfoActive() && !recoveryInProgress)
> > + {
> >
> > Do we need 'recoveryInProgress' check here?
> > start_logical_decoding_status_change() has taken care of that.
>
> Removed.
>
> >
> > 9)
> > Comments atop DisableLogicalDecodingIfNecessary:
> >
> > * This function expects to be called after dropping a possibly-last logical
> > * replication slot. Logical decoding can be disabled only when wal_level is set
> > * to 'replica' and there is no logical replication slot on the system.
> >
> > The comment is not completely true, shall we amend the comment to say
> > something like:
> >
> > This function is called after a logical slot is dropped, but it only
> > disables logical decoding on primary if it was the last remaining
> > logical slot and wal_level < logical. Otherwise, it performs no
> > action.
>
> Thank you for the suggestion. I modified the comment based on the suggestion
>
> >
> > 10)
> > When we try to create or drop a logical slot on standby, and if
> > delay_status_change is false, shall we immediately exit? Currently it
> > does a lot of checks including CheckLogicalSlotExists() which can be
> > completely avoided. I think it is worth having a quick
> > 'RecoveryInProgress() && !delay_status_change' check in the beginning.
>
> Yeah, we can simplify the start_logical_decoding_status_change() logic more.
>
> I've attached the updated patch.
>
Hi Sawada-san,
I have reviewed the patch and have few comments:
1. There are some spelling mistakes in logicaldecoding.sgml
+ and requires waiting for any concurrent transactions to finish, ensureing
+ system-wide conistency. Conversely, when the last logical replication slot
ensureing -> ensuring
conistency -> consistency
2. In publicationcmds.c:
+ errmsg("logical decoding needs to be enabled to
publish logical changes"),
+ errhint("Set \"wal_level\" >= \"logical\" or create a
logical replication slot with \"replica\" WAL level before creating
subscriptions.")));
Should we use something like this for errhint ?
errhint("Set \"wal_level\" >= \"logical\" or create a logical
replication slot before creating subscriptions when \"wal_level\" =
\"replica\".")));
3. In logical.c:
+ errmsg("logical decoding needs to be enabled on the primary"),
+ errhint("Set \"wal_level\" >= \"logical\" or create
at least one logical slot with \"replica\" WAL level on the
primary.")));
Should we change the errhint message as below?
errhint("Set \"wal_level\" >= \"logical\" or create at least one
logical slot on the primary when \"wal_level\" = \"replica\".")));
4. In slotsync.c:
+ errmsg("replication slot synchronization requires
logical decoding to be enabled"),
+ errhint("Set \"wal_level\" >= \"logical\" or create at
least one logical slot with \"replica\" WAL level on the primary "));
Should we change the errhint message as below?
errhint("Set \"wal_level\" >= \"logical\" or create at least one
logical slot on the primary when \"wal_level\" = \"replica\".")));
---------
I have also tested the patch with creating multiple permanent/
temporary slots in concurrent sessions and I did not find any issue. I
also tested this patch with a physical replication setup.
I have a doubt in this case:
1. Suppose we have a physical replication setup. wal_level on primary
is set to 'replica'
2. Now we try to create a logical slot on standby, an error is thrown
as wal_level is set to 'replica' as primary does not have any logical
slot
3. Now we create a temporary logical slot on primary,
effective_wal_level is set to logical.
4. Now we can create slots on standby as effective_wal_level is logical.
5. Now we exit the session of the primary server. The temporary slot
is dropped. This will invalidate the slots on standby as the
effective_wal_level will be set to 'replica'.
So we can say that indirectly a temporary slot on primary can control
the behaviour of permanent slots on standbys.
I checked this behaviour in HEAD. In HEAD also the behaviour is the
same. If we change the wal_level on primary from 'logical' to
'replica', all slots are invalidated on the standby.
With patch this behaviour can be indirectly controlled by a temporary
slot. Is it fine? Thoughts?
Thanks,
Shlok Kyal
From | Date | Subject | |
---|---|---|---|
Next Message | shveta malik | 2025-08-12 08:36:30 | Re: Conflict detection for update_deleted in logical replication |
Previous Message | Heikki Linnakangas | 2025-08-12 08:01:45 | Re: Correct documentation for protocol version |