Re: Synchronizing slots from primary to standby

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-11-10 07:20:29
Message-ID: 538ddca6-cf74-4a9c-95d6-dd05af24070c@gmail.com
Lists: pgsql-hackers

Hi,

On 11/10/23 6:41 AM, Amit Kapila wrote:
> On Thu, Nov 9, 2023 at 7:29 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Are you saying that we change the state of the already existing slot
> on standby?

Yes.

> And, such a state would indicate that we are trying to
> sync the slot with the same name from the primary. Is that what you
> have in mind?

Yes.

> If so, it appears quite odd to me to have such a state
> and also set it in some unrelated slot that just has the same name.
>

> I understand your point that we can allow other slots to proceed but
> it is also important to not create any sort of inconsistency that can
> surprise user after failover.

But even if we ERROR out instead of emitting a WARNING, the user would still
need to be notified of, or monitor for, such errors. I agree that they would
then probably find out earlier, because the slot sync mechanism would be
stopped, but that is still not "guaranteed" (especially if there are no other
"working" synced slots around). And if they do not find out, there is still a
risk that they use this slot after a failover, thinking it is a "synced" slot.

Giving it more thought, what about using a dedicated/reserved naming convention
for synced slots, like sync_<primary_slot_name> or such, and then:

- prevent users from creating sync_<whatever> slots on the standby
- sync <slot> on the primary to sync_<slot> on the standby
- during failover, rename sync_<slot> to <slot>, and if <slot> already exists
then emit a WARNING and keep sync_<slot> in place.

That way both slots are still in place (the manually created <slot> and
the sync_<slot>) and one could decide what to do with them.
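
To make the idea above a bit more concrete, here is a minimal standalone
sketch of the three rules (reserve the prefix, map the name, rename at
failover). It is not patch code and does not use any PostgreSQL API: the
"sync_" prefix, the function names, the PromoteResult enum, and the 64-byte
buffer size (chosen to mirror the default NAMEDATALEN) are all assumptions
made for illustration only.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reserved prefix; the thread has not settled on a name. */
#define SYNC_SLOT_PREFIX "sync_"
/* Assumption: mirrors PostgreSQL's default NAMEDATALEN (64, incl. NUL). */
#define SLOT_NAME_MAX 64

/* Rule 1: a user-supplied name is rejected if it uses the reserved prefix. */
bool
slot_name_is_reserved(const char *name)
{
    return strncmp(name, SYNC_SLOT_PREFIX, strlen(SYNC_SLOT_PREFIX)) == 0;
}

/*
 * Rule 2: build the standby-side name for a slot synced from the primary.
 * Returns false if the prefixed name does not fit in dst.
 */
bool
synced_slot_name(const char *primary_name, char *dst, size_t dstlen)
{
    int n = snprintf(dst, dstlen, "%s%s", SYNC_SLOT_PREFIX, primary_name);

    return n >= 0 && (size_t) n < dstlen;
}

typedef enum
{
    PROMOTE_RENAMED,        /* sync_<slot> becomes <slot> */
    PROMOTE_KEPT_CONFLICT   /* <slot> exists: warn, keep sync_<slot> */
} PromoteResult;

/*
 * Rule 3: at failover, decide the final name of a synced slot.
 * synced_name must start with SYNC_SLOT_PREFIX. target_exists tells
 * whether a manually created slot with the unprefixed name is present.
 * The resulting name is written to out.
 */
PromoteResult
promote_synced_slot(const char *synced_name, bool target_exists,
                    char *out, size_t outlen)
{
    const char *target = synced_name + strlen(SYNC_SLOT_PREFIX);

    if (target_exists)
    {
        fprintf(stderr,
                "WARNING: slot \"%s\" already exists, keeping \"%s\"\n",
                target, synced_name);
        snprintf(out, outlen, "%s", synced_name);
        return PROMOTE_KEPT_CONFLICT;
    }

    snprintf(out, outlen, "%s", target);
    return PROMOTE_RENAMED;
}
```

In the conflict case nothing is dropped or overwritten; the only visible
effect is the WARNING, and both names remain for the user to sort out.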

I don't think we'd need to worry about cases where a sync_ slot was already
created before we "prevent" the creation of such slots. Indeed, I think slots
would not survive pg_upgrade prior to 17 -> 18 upgrades. So it looks like we'd
be good as long as we are able to prevent sync_ slot creation on 17.

Thoughts?

> Also, the current coding doesn't ensure
> we will always give WARNING. If we see the below code that deals with
> this WARNING,
>
> +  /* User created slot with the same name exists, emit WARNING. */
> +  else if (found && s->data.sync_state == SYNCSLOT_STATE_NONE)
> +  {
> +    ereport(WARNING,
> +        errmsg("not synchronizing slot %s; it is a user created slot",
> +           remote_slot->name));
> +  }
> +  /* Otherwise create the slot first. */
> +  else
> +  {
> +    TransactionId xmin_horizon = InvalidTransactionId;
> +    ReplicationSlot *slot;
> +
> +    ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
> +               remote_slot->two_phase, false);
>
> I think this is not a solid check to ensure that the slot existed
> before. Because it could be created as soon as the slot sync worker
> invokes ReplicationSlotCreate() here.

Agreed.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
