Re: [BUG] [PATCH] Allow physical replication slots to recover from archive after invalidation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Joao Foltran <joao(at)foltrandba(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] [PATCH] Allow physical replication slots to recover from archive after invalidation
Date: 2025-12-16 09:15:01
Message-ID: CAA4eK1JQRhWoMmNfUqYmVUbxCCO0VjB=aQevr6oo=uXumVzaig@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 16, 2025 at 9:54 AM Joao Foltran <joao(at)foltrandba(dot)com> wrote:
>
> Thank you for clarifying this behavior to me! I've tested it and it
> really doesn't hold back wals anymore once it has been invalidated due
> to the check inside ReplicationSlotsComputeRequiredLSN().
>
> You are correct that simply letting the slot be reacquired and
> continue working would be dangerous leading to possibly losing WALs.
> Can we then check if the standby was able to reconnect and start
> streaming successfully and then change the slots information for it to
> be considered inside ReplicationSlotsComputeRequiredLSN() again?
>
> Example:
>
> in XLogSendPhysical(), after we have seen that the first record was sent:
>
> // In XLogSendPhysical() after XLogReadRecord() succeeds
> if (first_record_sent &&
>     MyReplicationSlot &&
>     SlotIsPhysical(MyReplicationSlot) &&
>     MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
> {
>     // Clear invalidation - we successfully read WAL
> }
>
> This would clear the invalidation only after we know for sure that it
> can continue streaming wals without problem.
>

The slots could be invalidated for other reasons as well, such as
RS_INVAL_IDLE_TIMEOUT. It doesn't sound like a good idea to clear
the invalidation flag of the slot, because in the future we could decide
to invalidate slots for additional reasons too. I think it would be
better to do the required forensics on the invalid slot and then
re-create it if we want to retain the required WAL. Why don't you prefer
to re-create the slot once it is invalidated?
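
To illustrate the concern, here is a minimal standalone sketch. It is
not PostgreSQL source: the Model* types, the MODEL_INVAL_* values, and
both functions are hypothetical simplifications that exist only for
this example. It shows that a "clear on first record sent" rule of the
kind proposed above would also erase an idle-timeout invalidation, even
though successfully reading WAL says nothing about that cause, whereas
re-creating the slot starts from fresh state without touching the old
slot.

```c
/*
 * Standalone model only; NOT PostgreSQL code. All names here are
 * hypothetical and merely echo the terms used in this thread.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum ModelInvalCause
{
    MODEL_INVAL_NONE = 0,
    MODEL_INVAL_WAL_REMOVED,    /* required WAL was removed */
    MODEL_INVAL_IDLE_TIMEOUT    /* slot was idle for too long */
    /* more causes could be added later */
} ModelInvalCause;

typedef struct ModelSlot
{
    ModelInvalCause invalidated;
    uint64_t        restart_lsn;
} ModelSlot;

/*
 * The proposed rule: once the first record has been sent, clear the
 * invalidation. Note that it does not (and cannot easily) distinguish
 * between causes, so every current and future cause gets cleared.
 */
void
model_clear_on_first_record(ModelSlot *slot, bool first_record_sent)
{
    if (first_record_sent && slot->invalidated != MODEL_INVAL_NONE)
        slot->invalidated = MODEL_INVAL_NONE;
}

/*
 * The alternative: leave the invalid slot untouched (so it can still be
 * inspected) and build a fresh slot starting at the current position.
 */
ModelSlot
model_recreate_slot(const ModelSlot *old_slot, uint64_t current_lsn)
{
    ModelSlot   fresh;

    assert(old_slot->invalidated != MODEL_INVAL_NONE);
    fresh.invalidated = MODEL_INVAL_NONE;
    fresh.restart_lsn = current_lsn;
    return fresh;
}
```

In this model, a slot invalidated with MODEL_INVAL_IDLE_TIMEOUT ends up
valid again after model_clear_on_first_record(), which is the behavior
being questioned; model_recreate_slot() leaves the old slot's cause in
place.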

--
With Regards,
Amit Kapila.
