Re: replication slot restart_lsn initialization

From: Andres Freund <andres(at)anarazel(dot)de>
To: Gurjeet Singh <gurjeet(at)singh(dot)im>
Cc: "Duran, Danilo" <danilod(at)amazon(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: replication slot restart_lsn initialization
Date: 2015-08-10 14:12:54
Message-ID: 20150810141254.GF16192@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-07-07 09:42:54 -0700, Gurjeet Singh wrote:
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -40,10 +40,10 @@
> #include <sys/stat.h>
>
> #include "access/transam.h"
> +#include "access/xlog_internal.h"
> #include "common/string.h"
> #include "miscadmin.h"
> #include "replication/slot.h"
> -#include "storage/fd.h"
> #include "storage/proc.h"
> #include "storage/procarray.h"

Why did you remove fd.h? The file definitely uses infrastructure from
there. We're not terribly consistent about that but I'd rather not rely
on it being included via xlog_internal.h -> xlog.h.

> /*
> + * Grab and save an LSN value to prevent WAL recycling past that point.
> + */
> +void
> +ReplicationSlotRegisterRestartLSN()
> +{

Didn't like that description and function name very much. What does
'grabbing' mean here? Should probably mention that it works on the
currently active slot and modifies it.

It's now ReplicationSlotReserveWal()

> + ReplicationSlot *slot = MyReplicationSlot;
> +
> + Assert(slot != NULL);
> + Assert(slot->data.restart_lsn == InvalidXLogRecPtr);
> +
> + /*
> + * The replication slot mechanism is used to prevent removal of required
> + * WAL. As there is no interlock between this and checkpoints required, WAL
> + * segment could be removed before ReplicationSlotsComputeRequiredLSN() has
> + * been called to prevent that. In the very unlikely case that this happens
> + * we'll just retry.
> + */

You removed some punctuation in that sentence converting a sentence in
bad english into one without the original meaning ;). See the attached
for a new version.

> + while (true)
> + {
> + XLogSegNo segno;
> +
> + /*
> + * Let's start with enough information if we can, so log a standby
> + * snapshot and start logical decoding at exactly that position.
> + */
> + if (!RecoveryInProgress())
> + {
> + XLogRecPtr flushptr;
> +
> + /* start at current insert position */
> + slot->data.restart_lsn = GetXLogInsertRecPtr();
> +
> + /*
> + * Log an xid snapshot for logical replication. This snapshot is not
> + * needed for physical replication, as it relies on the snapshot
> + * created by checkpoint when the base backup starts.
> + */
> + if (slot->data.database != InvalidOid)
> + {
> + /* make sure we have enough information to start */
> + flushptr = LogStandbySnapshot();
> +
> + /* and make sure it's fsynced to disk */
> + XLogFlush(flushptr);
> + }
> + }
> + else
> + slot->data.restart_lsn = GetRedoRecPtr();
> +
> + /* prevent WAL removal as fast as possible */
> + ReplicationSlotsComputeRequiredLSN();
> +
> + /*
> + * If all required WAL is still there, great, otherwise retry. The
> + * slot should prevent further removal of WAL, unless there's a
> + * concurrent ReplicationSlotsComputeRequiredLSN() after we've written
> + * the new restart_lsn above, so normally we should never need to loop
> + * more than twice.
> + */
> + XLByteToSeg(slot->data.restart_lsn, segno);
> + if (XLogGetLastRemovedSegno() < segno)
> + break;
> + }
> +}

The way you added the check for logical vs. physical slots in there
looks wrong to me. For a physical slot created !InRecovy we'll possibly
return a xlog position from the future (it's the insert position *and*
not flushed to disk), which then cannot be received.

> +/*
> * Flush all replication slots to disk.
> *
> * This needn't actually be part of a checkpoint, but it's a convenient
> @@ -876,7 +942,7 @@ StartupReplicationSlots(void)
> }
>
> /* ----
> - * Manipulation of ondisk state of replication slots
> + * Manipulation of on-disk state of replication slots
> *
> * NB: none of the routines below should take any notice whether a slot is the
> * current one or not, that's all handled a layer above.
> diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
> index 9a2793f..01b376a 100644
> --- a/src/backend/replication/slotfuncs.c
> +++ b/src/backend/replication/slotfuncs.c
> @@ -40,6 +40,7 @@ Datum
> pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
> {
> Name name = PG_GETARG_NAME(0);
> + bool immediately_reserve = PG_GETARG_BOOL(1);
> Datum values[2];
> bool nulls[2];
> TupleDesc tupdesc;
> @@ -58,10 +59,28 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
> /* acquire replication slot, this will check for conflicting names */
> ReplicationSlotCreate(NameStr(*name), false, RS_PERSISTENT);
>
> - values[0] = NameGetDatum(&MyReplicationSlot->data.name);
> + if (immediately_reserve)
> + {
> + /* Allocate restart-LSN, if the user asked for it */
> + ReplicationSlotRegisterRestartLSN();
> +
> + /* Write this slot to disk */
> + ReplicationSlotMarkDirty();
> + ReplicationSlotSave();
>
> - nulls[0] = false;
> - nulls[1] = true;
> + values[0] = NameGetDatum(&MyReplicationSlot->data.name);
> + values[1] = LSNGetDatum(MyReplicationSlot->data.restart_lsn);
> +
> + nulls[0] = false;
> + nulls[1] = false;
> + }
> + else
> + {
> + values[0] = NameGetDatum(&MyReplicationSlot->data.name);
> +
> + nulls[0] = false;
> + nulls[1] = true;
> + }

I moved
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
nulls[0] = false;
to outside the conditional block, there seems no reason to have it in
there?

I also removed a bunch of unrelated minor cleanups that I plan to commit
& backpatch separately.

What do you think?

Andres

Attachment Content-Type Size
0001-Introduce-macros-determining-if-a-replication-slot-i.patch text/x-patch 4.0 KB
0002-Allow-pg_create_physical_replication_slot-to-reserve.patch text/x-patch 11.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2015-08-10 14:32:26 Re: tap tests remove working directories
Previous Message Tom Lane 2015-08-10 14:08:32 Re: Using contrib modules in check (Re: pgsql: Fix BRIN to use SnapshotAny during summarization)