From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Review for GetWALAvailability() |
Date: | 2020-06-25 05:35:34 |
Message-ID: | a9c65f68-8ea5-5bb3-349d-6d909fff38bc@oss.nttdata.com |
Lists: | pgsql-hackers |
On 2020/06/25 12:57, Alvaro Herrera wrote:
> On 2020-Jun-25, Fujii Masao wrote:
>
>> /*
>> * Find the oldest extant segment file. We get 1 until checkpoint removes
>> * the first WAL segment file since startup, which causes the status being
>> * wrong under certain abnormal conditions but that doesn't actually harm.
>> */
>> oldestSeg = XLogGetLastRemovedSegno() + 1;
>>
>> I see the point of the above comment, but this can cause wal_status to be
>> changed from "lost" to "unreserved" after the server restart. Isn't this
>> really confusing? At least it seems better to document that behavior.
>
> Hmm.
>
>> Or if we *can ensure* that the slot with invalidated_at set always means
>> "lost" slot, we can judge that wal_status is "lost" without using fragile
>> XLogGetLastRemovedSegno(). Thought?
>
> Hmm, this sounds compelling -- I think it just means we need to ensure
> we reset invalidated_at to zero if the slot's restart_lsn is set to a
> correct position afterwards.
Yes.
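To make the invariant concrete, here is a minimal standalone sketch, not the actual slot.c code. `SlotData`, `slot_invalidate()`, `slot_set_restart_lsn()` and `slot_is_lost()` are hypothetical names used only for illustration, and `XLogRecPtr` is reduced to a plain 64-bit integer. The assumption is that every path that gives the slot a valid restart_lsn also resets invalidated_at, so a set invalidated_at can be read as "lost" without consulting XLogGetLastRemovedSegno().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's XLogRecPtr / InvalidXLogRecPtr. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, trimmed-down slot data: only the two fields discussed. */
typedef struct SlotData
{
	XLogRecPtr	restart_lsn;
	XLogRecPtr	invalidated_at;
} SlotData;

/* Mirror of the invalidation step: remember where we were, then clear. */
void
slot_invalidate(SlotData *d)
{
	assert(d != NULL);
	d->invalidated_at = d->restart_lsn;
	d->restart_lsn = InvalidXLogRecPtr;
}

/*
 * Hypothetical setter: whenever restart_lsn becomes valid again,
 * invalidated_at is reset, which keeps the invariant intact.
 */
void
slot_set_restart_lsn(SlotData *d, XLogRecPtr lsn)
{
	assert(d != NULL);
	d->restart_lsn = lsn;
	if (lsn != InvalidXLogRecPtr)
		d->invalidated_at = InvalidXLogRecPtr;
}

/* With the invariant in place, "lost" depends only on invalidated_at. */
bool
slot_is_lost(const SlotData *d)
{
	assert(d != NULL);
	return d->invalidated_at != InvalidXLogRecPtr;
}
```

In this sketch the "lost" judgement does not depend on any state that is reset at server startup, which is the property we want for wal_status.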
> I don't think we have any operation that
> does that, so it should be safe -- hopefully I didn't overlook anything?
Don't we need to call ReplicationSlotMarkDirty() and ReplicationSlotSave()
just after setting invalidated_at and restart_lsn in InvalidateObsoleteReplicationSlots()?
Otherwise, restart_lsn can revert to its previous on-disk value after a server restart.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index e8761f3a18..5584e5dd2c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1229,6 +1229,13 @@ restart:
 			s->data.invalidated_at = s->data.restart_lsn;
 			s->data.restart_lsn = InvalidXLogRecPtr;
 			SpinLockRelease(&s->mutex);
+
+			/*
+			 * Save this invalidated slot to disk, to ensure that the slot
+			 * is still invalid even after the server restart.
+			 */
+			ReplicationSlotMarkDirty();
+			ReplicationSlotSave();
 			ReplicationSlotRelease();
 			/* if we did anything, start from scratch */
Maybe we don't need to do this if the slot is temporary?
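The restart problem can be illustrated with a toy model, again not the real slot.c code. `ToySlot`, `toy_invalidate()`, `toy_save()` and `toy_restart()` are hypothetical names; the "disk" copy is just a second struct, and a restart is modelled as reloading memory from it. The `temporary` flag reflects the question above and assumes that a temporary slot does not survive a restart, so saving it would be pointless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's XLogRecPtr / InvalidXLogRecPtr. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, trimmed-down slot data. */
typedef struct SlotData
{
	XLogRecPtr	restart_lsn;
	XLogRecPtr	invalidated_at;
	bool		temporary;
} SlotData;

/* Toy slot: an in-memory copy, a pretend on-disk copy, and a dirty flag. */
typedef struct ToySlot
{
	SlotData	mem;
	SlotData	disk;
	bool		dirty;
} ToySlot;

/* Write the in-memory state to "disk" if it has been marked dirty. */
void
toy_save(ToySlot *s)
{
	assert(s != NULL);
	if (s->dirty)
	{
		s->disk = s->mem;
		s->dirty = false;
	}
}

/* Invalidate the slot; optionally persist, skipping temporary slots. */
void
toy_invalidate(ToySlot *s, bool persist)
{
	assert(s != NULL);
	s->mem.invalidated_at = s->mem.restart_lsn;
	s->mem.restart_lsn = InvalidXLogRecPtr;

	if (persist && !s->mem.temporary)
	{
		s->dirty = true;		/* plays the role of marking the slot dirty */
		toy_save(s);			/* plays the role of saving the slot */
	}
}

/* Model a server restart: memory is rebuilt from the on-disk copy. */
void
toy_restart(ToySlot *s)
{
	assert(s != NULL);
	s->mem = s->disk;
	s->dirty = false;
}
```

Calling toy_invalidate(&s, false) followed by toy_restart(&s) brings the old restart_lsn back, while toy_invalidate(&s, true) keeps the slot invalid across the restart. That is the behaviour the patch above is meant to ensure.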
> Neither copy nor advance seem to work with a slot that has invalid
> restart_lsn.
>
>> Or XLogGetLastRemovedSegno() should be fixed so that it returns valid
>> value even after the restart?
>
> This seems more work to implement.
Yes.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION