From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | masao(dot)fujii(at)oss(dot)nttdata(dot)com |
Cc: | alvherre(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Review for GetWALAvailability() |
Date: | 2020-06-25 08:28:03 |
Message-ID: | 20200625.172803.429475667684022055.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Thu, 25 Jun 2020 14:35:34 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2020/06/25 12:57, Alvaro Herrera wrote:
> > On 2020-Jun-25, Fujii Masao wrote:
> >
> >> /*
> >> * Find the oldest extant segment file. We get 1 until checkpoint removes
> >> * the first WAL segment file since startup, which causes the status being
> >> * wrong under certain abnormal conditions but that doesn't actually harm.
> >> */
> >> oldestSeg = XLogGetLastRemovedSegno() + 1;
> >>
> >> I see the point of the above comment, but this can cause wal_status to be
> >> changed from "lost" to "unreserved" after the server restart. Isn't this
> >> really confusing? At least it seems better to document that behavior.
> > Hmm.
> >
> >> Or if we *can ensure* that the slot with invalidated_at set always means
> >> "lost" slot, we can judge that wal_status is "lost" without using fragile
> >> XLogGetLastRemovedSegno(). Thought?
> > Hmm, this sounds compelling -- I think it just means we need to ensure
> > we reset invalidated_at to zero if the slot's restart_lsn is set to a
> > correct position afterwards.
>
> Yes.
That is an error-prone restriction, as discussed before.
Without changing the updater side, the combination of an invalid
restart_lsn AND a valid invalidated_at can be regarded as the lost
state. With the following change suggested by Fujii-san, we can avoid
the confusing status.
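For illustration only, the "lost" condition described above could be
sketched as below. This is a minimal stand-alone sketch, not the attached
patch: SlotLsnState and slot_is_lost() are hypothetical names, and the
XLogRecPtr definitions are local stand-ins for the PostgreSQL ones so that
the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the PostgreSQL definitions (assumption: LSN 0 is invalid). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
#define XLogRecPtrIsInvalid(r) ((r) == InvalidXLogRecPtr)

/* Hypothetical, reduced view of the two slot fields discussed here. */
typedef struct SlotLsnState
{
    XLogRecPtr restart_lsn;     /* reset to invalid when the slot is invalidated */
    XLogRecPtr invalidated_at;  /* former restart_lsn, set at invalidation */
} SlotLsnState;

/*
 * Hypothetical helper: a slot counts as "lost" when restart_lsn is invalid
 * AND invalidated_at is valid. No removed-segment number is consulted.
 */
bool
slot_is_lost(const SlotLsnState *slot)
{
    return XLogRecPtrIsInvalid(slot->restart_lsn) &&
           !XLogRecPtrIsInvalid(slot->invalidated_at);
}
```

A slot that has never reserved WAL (both fields invalid) is not reported as
lost by this check, which is why both conditions are needed.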
With the attached first patch on top of the slot-dirtying fix below, we
get "lost" for invalidated slots after a restart.
> > I don't think we have any operation that
> > does that, so it should be safe -- hopefully I didn't overlook
> > anything?
>
> We need to call ReplicationSlotMarkDirty() and ReplicationSlotSave()
> just after setting invalidated_at and restart_lsn in
> InvalidateObsoleteReplicationSlots()?
> Otherwise, restart_lsn can go back to the previous value after the
> restart.
>
> diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> index e8761f3a18..5584e5dd2c 100644
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -1229,6 +1229,13 @@ restart:
> s->data.invalidated_at = s->data.restart_lsn;
> s->data.restart_lsn = InvalidXLogRecPtr;
> SpinLockRelease(&s->mutex);
> +
> + /*
> + * Save this invalidated slot to disk, to ensure that the slot
> + * is still invalid even after the server restart.
> + */
> + ReplicationSlotMarkDirty();
> + ReplicationSlotSave();
> ReplicationSlotRelease();
> /* if we did anything, start from scratch */
>
> Maybe we don't need to do this if the slot is temporary?
The only difference between temporary slots and persistent ones seems
to be the "persistency" attribute. Actually,
create_physical_replication_slot() does the above for temporary slots.
> > Neither copy nor advance seem to work with a slot that has invalid
> > restart_lsn.
> >
> >> Or XLogGetLastRemovedSegno() should be fixed so that it returns valid
> >> value even after the restart?
> > This seems more work to implement.
>
> Yes.
The confusing status can be avoided without fixing it, but I prefer to
fix it. As Fujii-san suggested upthread, couldn't we remember
lastRemovedSegNo in the control file? (Yeah, it causes a bump of
PG_CONTROL_VERSION and CATALOG_VERSION_NO?).
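As a toy model of why remembering the value would help (this is not
PostgreSQL code; ToyServer and the toy_* functions are hypothetical, and
the persisted field merely stands in for a value kept in the control
file), consider:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogSegNo;     /* local stand-in for the PostgreSQL type */

/* Hypothetical model of the two places the value could live. */
typedef struct ToyServer
{
    XLogSegNo shmem_last_removed;    /* in-memory value, lost at restart */
    XLogSegNo control_last_removed;  /* assumed persisted copy */
} ToyServer;

/* A checkpoint removes old segments up to segno and records that fact. */
void
toy_remove_up_to(ToyServer *s, XLogSegNo segno)
{
    s->shmem_last_removed = segno;
    s->control_last_removed = segno;
}

/* Restart: memory starts from 0 unless it is restored from the persisted copy. */
void
toy_restart(ToyServer *s, bool restore_from_control)
{
    s->shmem_last_removed = restore_from_control ? s->control_last_removed : 0;
}

/* Mirrors "oldestSeg = XLogGetLastRemovedSegno() + 1" from the quoted code. */
XLogSegNo
toy_oldest_seg(const ToyServer *s)
{
    return s->shmem_last_removed + 1;
}
```

Without the restore step, toy_oldest_seg() falls back to 1 after a restart
until the next segment removal, which is the situation the quoted code
comment describes.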
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
0001-Make-slot-invalidation-persistent.patch | text/x-patch | 955 bytes |
0002-Show-correct-value-in-pg_replication_slots.wal_statu.patch | text/x-patch | 2.0 KB |