Re: Physical replication slot advance is not persistent

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, simon(at)2ndquadrant(dot)com
Subject: Re: Physical replication slot advance is not persistent
Date: 2020-01-20 06:45:40
Message-ID: 20200120064540.GA57042@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 16, 2020 at 08:09:09PM +0300, Alexey Kondratov wrote:
> OK, I have definitely overthought that, thanks. This looks like a minimal
> subset of changes that actually solves the bug. I would only prefer to keep
> some additional comments (something like the attached), otherwise after half
> a year it will be unclear again, why we save slot unconditionally here.

Since this email, Andres has sent an email that did not reach the
community lists, but where all the participants of this thread were in
CC. Here is a summary of the points raised (please correct me if that
does not sound right to you, Andres):
1) The slot advancing has to mark the slot as dirty, but should we
make the change persistent at the end of the function or should we
wait for a checkpoint to do the work, meaning that any update done to
the slot would be lost if a crash occurs in-between? Note that we
have this commit in slotfuncs.c for
pg_logical_replication_slot_advance():
* Dirty the slot so it's written out at the next checkpoint.
* We'll still lose its position on crash, as documented, but it's
* better than always losing the position even on clean restart.

This comment refers to the documentation for the logical decoding
section (see logicaldecoding-replication-slots in
logicaldecoding.sgml), and even if nothing can be done until the slot
advance function reaches its hand, we ought to make the data
persistent if we can.

The original commit that introduced slot advancing is 9c7d06d. Here
is the thread, where this point was not really mentioned by the way:
https://www.postgresql.org/message-id/5c26ff40-8452-fb13-1bea-56a0338a809a@2ndquadrant.com

2) pg_replication_slot_advance() includes this code, which is broken:
/* Update the on disk state when lsn was updated. */
if (XLogRecPtrIsInvalid(endlsn))
{
ReplicationSlotMarkDirty();
ReplicationSlotsComputeRequiredXmin(false);
ReplicationSlotsComputeRequiredLSN();
ReplicationSlotSave();
}
Here the deal is that endlsn, aka the LSN where the slot has been
advanced (or its current position if no progress has been done) never
gets to be set to InvalidXLogRecPtr as of f731cfa, and that this work
should be done only when endlsn has done some progress. It seems to
me that this should have been the opposite to begin with in 9c7d06d,
aka do the save if endlsn is valid.

3) The amount of testing related to slot advancing could be better
with cluster-wide operations.

@@ -370,6 +370,11 @@ pg_physical_replication_slot_advance(XLogRecPtr
moveto)
MyReplicationSlot->data.restart_lsn = moveto;

SpinLockRelease(&MyReplicationSlot->mutex);
retlsn = moveto;
+
+ ReplicationSlotMarkDirty();
+
+ /* We moved retart_lsn, update the global value. */
+ ReplicationSlotsComputeRequiredLSN();
I think that the proposed patch is missing a call to
ReplicationSlotsComputeRequiredXmin() here for physical slots.

So, I have been looking at this patch by myself, and updated it so as
the extra slot save is done only if any advancing has been done, on
top of the other computations that had better be around for
consistency. The patch includes TAP tests for physical and logical
slots' durability across restarts.

Thoughts?
--
Michael

Attachment Content-Type Size
v6-0001-Make-physical-slot-advance-to-be-persistent.diff text/x-diff 6.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2020-01-20 07:02:37 Re: making the backend's json parser work in frontend code
Previous Message Paul A Jungwirth 2020-01-20 05:57:57 Re: range_agg