From: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | pgsql-docs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Mention idle_replication_slot_timeout in pg_replication_slots docs |
Date: | 2025-06-30 11:32:07 |
Message-ID: | CABdArM4KqFVoSzZBYs4jPBEo=4e1kjcOpCF8rWZSthxZFcnWdw@mail.gmail.com |
Lists: | pgsql-docs |
On Fri, Jun 27, 2025 at 5:40 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>
>
> On 2025/06/27 15:32, Nisha Moond wrote:
> > On Thu, Jun 26, 2025 at 1:33 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >>
> >>
> >>
> >> On 2025/06/26 15:46, Nisha Moond wrote:
> >>> On Wed, Jun 25, 2025 at 9:56 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> The pg_replication_slots documentation mentions only max_slot_wal_keep_size
> >>>> as a condition under which the wal_status column can show unreserved or lost.
> >>>> However, since commit ac0e33136ab, idle_replication_slot_timeout can also
> >>>> cause this behavior when it is set. This has not been documented yet.
> >>>> https://www.postgresql.org/docs/devel/view-pg-replication-slots.html
> >>>>
> >>>
> >>> +1 to the doc update.
> >>
> >> Thanks for the review!
> >>
> >>
> >>>> So, how about updating the documentation to also mention
> >>>> idle_replication_slot_timeout as a factor that can cause wal_status to
> >>>> become unreserved or lost? Patch attached.
> >>>>
> >>>
> >>> Since idle_replication_slot_timeout can only cause wal_status to
> >>> become 'lost' and not 'unreserved', perhaps we can reword the sentence
> >>> slightly for clarity, suggestion -
> >>> "The last two states are seen when max_slot_wal_keep_size is
> >>> non-negative and, the 'lost' state may also appear when
> >>> idle_replication_slot_timeout is greater than zero."
> >>
> >> I was thinking that when idle_replication_slot_timeout triggers,
> >> the following functions are called, and that wal_status can become
> >> "unreserved" before ReplicationSlotRelease() runs. It's a very short
> >> period, though. Am I wrong?
> >>
> >> ReplicationSlotMarkDirty();
> >> ReplicationSlotSave();
> >> ReplicationSlotRelease();
> >>
> >
> > Thank you for pointing it out.
> > You are correct that while the checkpointer is in the process of
> > invalidating a slot, it sets its PID as the slot’s active_pid. During
> > this short window, if a user queries pg_replication_slots, the
> > underlying function pg_get_replication_slots will compute the
> > wal_status as 'unreserved' for the invalidated slot because the slot
> > has a valid active_pid.
> >
> > That said, it's reasonable to mention in the doc that 'unreserved' may
> > appear when idle_replication_slot_timeout is greater than zero, as
> > this can indeed happen. So, let's retain the current description.
> >
> > However, this behavior isn’t specific to
> > idle_replication_slot_timeout. For example, when a slot is being
> > invalidated due to a different cause "wal_level_insufficient",
> > 'unreserved' may also briefly appear in wal_status.
>
> Yes, and "lost" can appear for various reasons, including wal_level_insufficient,
> so it seems odd to highlight max_slot_wal_keep_size as the cause of the "lost"
> status in the note. It would probably be better to remove the mention of "lost"
> from that note.
>
+1
> As for "unreserved", it can also occur for different reasons, but typically,
> it happens when max_slot_wal_keep_size is set to a non-negative value.
> So it might make sense to keep the explanation focused just on "unreserved"
> and max_slot_wal_keep_size. For example:
>
> ----------------------
> <listitem>
> <para>
> <literal>unreserved</literal> means that the slot no longer
> retains the required WAL files and some of them are to be removed at
> - the next checkpoint. This state can return
> + the next checkpoint. This can occur when
> + <xref linkend="guc-max-slot-wal-keep-size"/> is set to
> + a non-negative value. This state can return
> to <literal>reserved</literal> or <literal>extended</literal>.
> </para>
> </listitem>
> <listitem>
> ----------------------
>
> What do you think?
>
The change LGTM. One minor suggestion: add "typically", as in "This
can typically occur when…", to indicate that max_slot_wal_keep_size is
one possible reason, not the only one.
>
> Also, I noticed the note that says “If <structfield>restart_lsn</structfield>
> is NULL, this field is null” seems inaccurate. For example, when "wal_removed"
> happens, restart_lsn is NULL but wal_status is "lost". So maybe we should remove
> that note as well?
You're right, the statement is not accurate.
We could rephrase it as: "If <structfield>restart_lsn</structfield> is
NULL, this field is either null or lost." But since 'unreserved' can
also appear briefly during invalidation, it might be better to remove
it altogether.
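
For illustration only, the reporting behaviour discussed in this thread
can be modelled roughly as below. This is a simplified Python sketch, not
the PostgreSQL source: the function name, its parameters, and the PID and
LSN values are made up for the example, and it does not distinguish
between invalidation causes.

```python
from typing import Optional


def wal_status(invalidated: bool,
               active_pid: Optional[int],
               restart_lsn: Optional[str],
               availability: str = "reserved") -> Optional[str]:
    """Hypothetical model of the wal_status value shown for one slot."""
    if invalidated:
        # While the checkpointer is still invalidating the slot, its PID
        # is the slot's active_pid, so the status reads "unreserved".
        if active_pid is not None:
            return "unreserved"
        # Once the slot has been released, the status reads "lost".
        return "lost"
    if restart_lsn is None:
        # No restart_lsn and no invalidation: the field is null.
        return None
    # Otherwise the status depends on how much WAL is retained
    # ("reserved", "extended" or "unreserved"); not modelled here.
    return availability


# Invalidation in progress: the checkpointer's PID is the active_pid.
print(wal_status(True, 4242, "0/3000000"))   # unreserved
# Invalidation finished and the slot released.
print(wal_status(True, None, "0/3000000"))   # lost
# wal_removed case from this thread: restart_lsn is NULL, status is lost.
print(wal_status(True, None, None))          # lost
```

The last call is the case that makes the existing note inaccurate:
restart_lsn is NULL, yet wal_status is not null.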
--
Thanks,
Nisha