Re: Physical replication slot advance is not persistent

From: Andres Freund <andres(at)anarazel(dot)de>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, simon(at)2ndquadrant(dot)com
Subject: Re: Physical replication slot advance is not persistent
Date: 2020-01-20 19:00:14
Message-ID: 20200120190014.h4pvoep2t44foy4y@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-01-20 15:45:40 +0900, Michael Paquier wrote:
> On Thu, Jan 16, 2020 at 08:09:09PM +0300, Alexey Kondratov wrote:
> > OK, I have definitely overthought that, thanks. This looks like a minimal
> > subset of changes that actually solves the bug. I would only prefer to keep
> > some additional comments (something like the attached), otherwise after half
> > a year it will be unclear again, why we save slot unconditionally here.
>
> Since this email, Andres has sent an email that did not reach the
> community lists, but where all the participants of this thread were in
> CC.

Ugh, that was an accident.

> Here is a summary of the points raised (please correct me if that
> does not sound right to you, Andres):

> 1) The slot advancing has to mark the slot as dirty, but should we
> make the change persistent at the end of the function or should we
> wait for a checkpoint to do the work, meaning that any update done to
> the slot would be lost if a crash occurs in-between? Note that we
> have this commit in slotfuncs.c for
> pg_logical_replication_slot_advance():
> * Dirty the slot so it's written out at the next checkpoint.
> * We'll still lose its position on crash, as documented, but it's
> * better than always losing the position even on clean restart.
>
> This comment refers to the documentation for the logical decoding
> section (see logicaldecoding-replication-slots in
> logicaldecoding.sgml), and even if nothing can be done until the slot
> advance function reaches its hand, we ought to make the data
> persistent if we can.

That doesn't really seem like a meaningful reference, because the
concerns between constantly streaming out changes (where we don't want
to fsync every single transaction), and doing so in a manual advance
through an sql function, seem different.

> 3) The amount of testing related to slot advancing could be better
> with cluster-wide operations.
>
> @@ -370,6 +370,11 @@ pg_physical_replication_slot_advance(XLogRecPtr
> moveto)
> MyReplicationSlot->data.restart_lsn = moveto;
>
> SpinLockRelease(&MyReplicationSlot->mutex);
> retlsn = moveto;
> +
> + ReplicationSlotMarkDirty();
> +
> + /* We moved retart_lsn, update the global value. */
> + ReplicationSlotsComputeRequiredLSN();
> I think that the proposed patch is missing a call to
> ReplicationSlotsComputeRequiredXmin() here for physical slots.

Hm. It seems ok to include, but I don't think omitting it currently has
negative effects?

> So, I have been looking at this patch by myself, and updated it so as
> the extra slot save is done only if any advancing has been done, on
> top of the other computations that had better be around for
> consistency.

Hm, I don't necessarily what that's necessary.

> The patch includes TAP tests for physical and logical slots'
> durability across restarts.

Cool!

> diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
> index bb69683e2a..af3e114fc9 100644
> --- a/src/backend/replication/slotfuncs.c
> +++ b/src/backend/replication/slotfuncs.c
> @@ -359,17 +359,20 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
> * checkpoints.
> */
> static XLogRecPtr
> -pg_physical_replication_slot_advance(XLogRecPtr moveto)
> +pg_physical_replication_slot_advance(XLogRecPtr moveto, bool *advance_done)
> {
> XLogRecPtr startlsn = MyReplicationSlot->data.restart_lsn;
> XLogRecPtr retlsn = startlsn;
>
> + *advance_done = false;
> +
> if (startlsn < moveto)
> {
> SpinLockAcquire(&MyReplicationSlot->mutex);
> MyReplicationSlot->data.restart_lsn = moveto;
> SpinLockRelease(&MyReplicationSlot->mutex);
> retlsn = moveto;
> + *advance_done = true;
> }
>
> return retlsn;

Hm. Why did you choose not to use endlsn as before (except being
broken), or something? It seems quite conceivable somebody is using
these functions in an extension.

> +# Test physical slot advancing and its durability. Create a new slot on
> +# the primary, not used by any of the standbys. This reserves WAL at creation.
> +my $phys_slot = 'phys_slot';
> +$node_master->safe_psql('postgres',
> + "SELECT pg_create_physical_replication_slot('$phys_slot', true);");
> +$node_master->psql('postgres', "
> + CREATE TABLE tab_phys_slot (a int);
> + INSERT INTO tab_phys_slot VALUES (generate_series(1,10));");
> +my $psql_rc = $node_master->psql('postgres',
> + "SELECT pg_replication_slot_advance('$phys_slot', 'FF/FFFFFFFF');");
> +is($psql_rc, '0', 'slot advancing works with physical slot');

Hm. I'm think testing this with real LSNs is a better idea. What if the
node actually already is past FF/FFFFFFFF at this point? Quite unlikely,
I know, but still. I.e. why not get the current LSN after the INSERT,
and assert that the slot, after the restart, is that?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesper Pedersen 2020-01-20 19:01:20 Re: Index Skip Scan
Previous Message Vik Fearing 2020-01-20 18:52:51 Re: Greatest Common Divisor