From: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: Synchronizing slots from primary to standby |
Date: | 2023-11-10 08:15:39 |
Message-ID: | dd9dbbaf-ca77-423a-8d62-bfc814626b47@gmail.com |
Lists: | pgsql-hackers |
Hi,
On 11/10/23 8:55 AM, Amit Kapila wrote:
> On Fri, Nov 10, 2023 at 12:50 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>>
>> But even if we ERROR out instead of emitting a WARNING, the user would still
>> need to be notified/monitor such errors. I agree that then probably they will
>> come to know earlier because the slot sync mechanism would be stopped but still
>> it is not "guaranteed" (especially if there are no other "working" synced slots
>> around).
>
>>
>> And if they do not, then there is still a risk to use this slot after a
>> failover thinking this is a "synced" slot.
>>
>
> I think this is another reason that giving an ERROR probably has better
> chances for the user to notice before failover. If, knowing of such errors,
> the user still proceeds with the failover, the onus is on her.
Agree. My concern is more when they don't know about the error.
> We can
> probably document this hazard along with the failover feature so that
> users are aware that they either need to be careful while creating
> slots on standby or consult ERROR logs. I guess we can even make it
> visible in the view also.
Yeah.
>> Giving it more thought, what about using a dedicated/reserved naming convention for
>> synced slots, like sync_<primary_slot_name> or such, and then:
>>
>> - prevent users from creating sync_<whatever> slots on standby
>> - sync <slot> on primary to sync_<slot> on standby
>> - during failover, rename sync_<slot> to <slot> and if <slot> exists then
>> emit a WARNING and keep sync_<slot> in place.
>>
>> That way both slots are still in place (the manually created <slot> and
>> the sync_<slot>) and one could decide what to do with them.
>>
>
> Hmm, I think after failover, users need to rename all slots or we need
> to provide a way to rename them so that they can be used by
> subscribers which sounds like much more work.
Agree that's much more work for the subscriber case. Maybe that's not worth
the extra work.
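For reference, the reserved-prefix idea quoted above could be sketched roughly as follows. This is a toy, single-process model and not code from the patch: ToySlots, toy_slot_exists(), toy_user_create(), toy_sync_create() and toy_promote() are hypothetical names made up for illustration, and everything the real slot machinery does (shared memory, locking, persistence) is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_SLOTS   8
#define NAME_LEN    64
#define SYNC_PREFIX "sync_"     /* hypothetical reserved prefix */

/* Toy slot table; stands in for the real replication slot array. */
typedef struct ToySlots
{
    char names[MAX_SLOTS][NAME_LEN];
    int  count;
} ToySlots;

bool
toy_slot_exists(const ToySlots *slots, const char *name)
{
    assert(slots != NULL && name != NULL);
    for (int i = 0; i < slots->count; i++)
        if (strcmp(slots->names[i], name) == 0)
            return true;
    return false;
}

/* User path on the standby: names with the reserved prefix are refused. */
bool
toy_user_create(ToySlots *slots, const char *name)
{
    if (strncmp(name, SYNC_PREFIX, strlen(SYNC_PREFIX)) == 0)
        return false;
    if (slots->count >= MAX_SLOTS || strlen(name) >= NAME_LEN ||
        toy_slot_exists(slots, name))
        return false;
    strcpy(slots->names[slots->count++], name);
    return true;
}

/* Sync path: <slot> on the primary becomes sync_<slot> on the standby. */
bool
toy_sync_create(ToySlots *slots, const char *primary_name)
{
    char synced[NAME_LEN];
    int  n = snprintf(synced, sizeof(synced), "%s%s", SYNC_PREFIX, primary_name);

    if (n < 0 || n >= (int) sizeof(synced))
        return false;           /* name too long for the toy table */
    if (toy_slot_exists(slots, synced))
        return true;            /* already synced earlier */
    if (slots->count >= MAX_SLOTS)
        return false;
    strcpy(slots->names[slots->count++], synced);
    return true;
}

/*
 * Failover: rename sync_<slot> to <slot>.  If <slot> already exists, warn
 * and keep sync_<slot> in place.  Returns the number of such conflicts.
 */
int
toy_promote(ToySlots *slots)
{
    size_t plen = strlen(SYNC_PREFIX);
    int    conflicts = 0;

    for (int i = 0; i < slots->count; i++)
    {
        char *name = slots->names[i];

        if (strncmp(name, SYNC_PREFIX, plen) != 0)
            continue;           /* not a synced slot */
        if (toy_slot_exists(slots, name + plen))
        {
            fprintf(stderr, "WARNING: slot \"%s\" exists, keeping \"%s\"\n",
                    name + plen, name);
            conflicts++;
            continue;
        }
        /* Strip the prefix in place, including the terminating NUL. */
        memmove(name, name + plen, strlen(name + plen) + 1);
    }
    return conflicts;
}
```

With a user-created slot a plus synced slots sync_a and sync_b, toy_promote() renames sync_b to b, keeps sync_a next to a, and reports one conflict. Subscribers would still have to be pointed at whichever slot survives, which is the extra work discussed here.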
>>> Also, the current coding doesn't ensure
>>> we will always give WARNING. If we see the below code that deals with
>>> this WARNING,
>>>
>>> +     /* User created slot with the same name exists, emit WARNING. */
>>> +     else if (found && s->data.sync_state == SYNCSLOT_STATE_NONE)
>>> +     {
>>> +         ereport(WARNING,
>>> +                 errmsg("not synchronizing slot %s; it is a user created slot",
>>> +                        remote_slot->name));
>>> +     }
>>> +     /* Otherwise create the slot first. */
>>> +     else
>>> +     {
>>> +         TransactionId xmin_horizon = InvalidTransactionId;
>>> +         ReplicationSlot *slot;
>>> +
>>> +         ReplicationSlotCreate(remote_slot->name, true, RS_EPHEMERAL,
>>> +                               remote_slot->two_phase, false);
>>>
>>> I think this is not a solid check to ensure that the slot existed
>>> before. Because it could be created as soon as the slot sync worker
>>> invokes ReplicationSlotCreate() here.
>>
>> Agree.
>>
>
> So, having a concrete check to give WARNING would require some more
> logic which I don't think is a good idea to handle this boundary case.
>
Yeah, good point. Agreed: let's just error out in all cases then (assuming we
discard the sync_ reserved-prefix proposal, which seems to be the case as it is
probably not worth the extra work).
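
To illustrate why the pre-check cannot guarantee the WARNING, here is a minimal sketch of the window between "check" and "create". It is a toy model under stated assumptions, not the patch's code: ToySlotTable, toy_find(), toy_create(), toy_sync_with_precheck(), toy_sync_error_only() and toy_user_creates_s1() are hypothetical names, the concurrent backend is simulated by a callback, and only the first sync of a slot is modelled (sync_state is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8
#define NAME_LEN  64

typedef enum { SLOT_CREATED, SLOT_ALREADY_EXISTS, SLOT_TABLE_FULL } CreateResult;
typedef enum { SYNC_WARNED, SYNC_CREATED, SYNC_ERRORED } SyncOutcome;

/* Toy slot table; stands in for the real replication slot array. */
typedef struct ToySlotTable
{
    char names[MAX_SLOTS][NAME_LEN];
    int  count;
} ToySlotTable;

/* Simulates another backend running between the check and the create. */
typedef void (*Interleave) (ToySlotTable *t);

bool
toy_find(const ToySlotTable *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return true;
    return false;
}

/* Creation does its own existence check, so it is the only reliable one. */
CreateResult
toy_create(ToySlotTable *t, const char *name)
{
    assert(t != NULL && name != NULL && strlen(name) < NAME_LEN);
    if (toy_find(t, name))
        return SLOT_ALREADY_EXISTS;
    if (t->count >= MAX_SLOTS)
        return SLOT_TABLE_FULL;
    strcpy(t->names[t->count++], name);
    return SLOT_CREATED;
}

/* Pre-check, then create: the WARNING branch can be bypassed by a race. */
SyncOutcome
toy_sync_with_precheck(ToySlotTable *t, const char *name, Interleave between)
{
    if (toy_find(t, name))
        return SYNC_WARNED;     /* the WARNING path */
    if (between != NULL)
        between(t);             /* a user creates the slot right here */
    return toy_create(t, name) == SLOT_CREATED ? SYNC_CREATED : SYNC_ERRORED;
}

/* No pre-check: any conflict surfaces as an error from creation itself. */
SyncOutcome
toy_sync_error_only(ToySlotTable *t, const char *name)
{
    return toy_create(t, name) == SLOT_CREATED ? SYNC_CREATED : SYNC_ERRORED;
}

/* Example interleaving: a user creates "s1" inside the window. */
void
toy_user_creates_s1(ToySlotTable *t)
{
    (void) toy_create(t, "s1");
}
```

In this model the same conflict yields SYNC_WARNED when the user slot exists before the check, but SYNC_ERRORED when it appears inside the window, so the outcome depends on timing. With toy_sync_error_only() the conflict is reported the same way in both cases.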
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com