| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Date: | 2025-10-23 10:39:39 |
| Message-ID: | CAJpy0uC2oXafcySxQZMVruRU4xBdPgJE67T08q02h4wLApKPNg@mail.gmail.com |
| Lists: | pgsql-hackers |
Please find a few comments:
1)
RequestDisableLogicalDecoding:
Shall we add 'Assert(!MyReplicationSlot)' here to ensure that this
function is invoked only after we have released and dropped the slot?
This would mirror 'Assert(MyReplicationSlot)' in
EnsureLogicalDecodingEnabled().
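To illustrate where such an assertion would sit, here is a minimal standalone sketch. It is not the patch's code: the ReplicationSlot stub, the disable_requested flag, and the function body are hypothetical stand-ins, and Assert() is mapped to the C library assert() so the fragment compiles outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's Assert(), which is active only in
 * assert-enabled builds. */
#define Assert(cond) assert(cond)

/* Hypothetical stub; the real struct is defined in replication/slot.h. */
typedef struct ReplicationSlot
{
    int dummy;
} ReplicationSlot;

/* Slot currently acquired by this backend, or NULL if none is held. */
static ReplicationSlot *MyReplicationSlot = NULL;

/* Hypothetical flag standing in for "a disable request was sent to the
 * checkpointer"; the real function's body is not shown in this thread. */
static bool disable_requested = false;

static void
RequestDisableLogicalDecoding(void)
{
    /* The caller must already have released and dropped its slot. */
    Assert(!MyReplicationSlot);

    disable_requested = true;
}
```

Calling RequestDisableLogicalDecoding() while MyReplicationSlot still points at a slot would trip the assertion in this model, which is the misuse the suggested check is meant to catch.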
2)
EnsureLogicalDecodingEnabled() sets status_change_inprogress to false
before writing the WAL record. Initially, I thought this could let
another session drop the same slot, since nothing seems to prevent it
(status_change_inprogress is false and LogicalDecodingControlLock has
been released). In theory, the checkpointer could then write the WAL
record that disables logical decoding before
EnsureLogicalDecodingEnabled() writes its WAL record that enables it,
potentially causing an issue. However, I could not reproduce this in
practice: the slot is still acquired by the first session, so another
session attempting to drop it cannot acquire it. That said, I still
lean towards setting status_change_inprogress = false only after the
WAL record has been written in EnsureLogicalDecodingEnabled().
Thoughts?
If not, we could add a comment explaining why this scenario is not a problem.
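The proposed ordering can be sketched as a small standalone model. Everything below is an assumption made for illustration: the event log, write_enable_wal_record(), and EnsureLogicalDecodingEnabledSketch() are hypothetical names, and locking via LogicalDecodingControlLock is deliberately left out. The only point shown is that the in-progress flag is cleared after the enabling WAL record is written.

```c
#include <assert.h>
#include <stdbool.h>

/* Events recorded by the model, in the order they happen. */
typedef enum
{
    EV_WAL_ENABLE_WRITTEN,
    EV_INPROGRESS_CLEARED
} Event;

static Event event_log[4];
static int   event_count = 0;

/* Models the shared flag discussed in the review comment. */
static bool status_change_inprogress = false;

/* Hypothetical stand-in for writing the WAL record that enables
 * logical decoding; it only records that the write happened. */
static void
write_enable_wal_record(void)
{
    event_log[event_count++] = EV_WAL_ENABLE_WRITTEN;
}

/* Sketch of the proposed ordering, not the patch's implementation. */
static void
EnsureLogicalDecodingEnabledSketch(void)
{
    status_change_inprogress = true;

    /* Write the WAL record first ... */
    write_enable_wal_record();

    /* ... and only then allow other status changes to proceed. */
    status_change_inprogress = false;
    event_log[event_count++] = EV_INPROGRESS_CLEARED;
}
```

With this ordering, any actor in the model that waits for status_change_inprogress to become false cannot emit a disabling record ahead of the enabling one, regardless of whether the slot is still held.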
3)
+ # Drop the logical slot, requesting to disable logical decoding to the checkpointer.
+ # It has to wait for the recovery to complete before disabling logical decoding.
+ $standby5->safe_psql('postgres',
+ qq[select pg_drop_replication_slot('standby5_slot');]);
+
+ # Resume the startup process to complete the recovery.
+ $standby5->safe_psql('postgres',
+ qq[select injection_points_wakeup('startup-logical-decoding-status-change-end-of-recovery')]
+ );
+
+ $standby5->wait_for_log(
+ "waiting for recovery completion to change logical decoding status");
Shouldn’t we check the log for "waiting for recovery completion..."
before triggering injection_points_wakeup?
IIUC, the current order may cause intermittent failures. Imagine that
drop-slot has not yet reached the LogicalDecodingStatusChangeAllowed
and RecoveryInProgress checks, and we release the injection point in
the meantime. In that case, drop-slot may never end up waiting, and we
might not see the expected log message.
thanks
Shveta