Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-02-22 05:01:34
Message-ID: CAJpy0uC=h4inQ41RVDffEYMfKYNDcP=MoQdLGgfA487UmVLt6w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 21, 2024 at 5:19 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> A few minor comments:

Thanks for the feedback.

> =================
> 1.
> +/*
> + * Is stopSignaled set in SlotSyncCtx?
> + */
> +bool
> +IsStopSignaledSet(void)
> +{
> + bool signaled;
> +
> + SpinLockAcquire(&SlotSyncCtx->mutex);
> + signaled = SlotSyncCtx->stopSignaled;
> + SpinLockRelease(&SlotSyncCtx->mutex);
> +
> + return signaled;
> +}
> +
> +/*
> + * Reset stopSignaled in SlotSyncCtx.
> + */
> +void
> +ResetStopSignaled(void)
> +{
> + SpinLockAcquire(&SlotSyncCtx->mutex);
> + SlotSyncCtx->stopSignaled = false;
> + SpinLockRelease(&SlotSyncCtx->mutex);
> +}
>
> I think these newly introduced functions don't need spinlock to be
> acquired as these are just one-byte read-and-write. Additionally, when
> IsStopSignaledSet() is invoked, there shouldn't be any concurrent
> process to update that value. What do you think?

Yes, we can avoid taking the spinlock here. These functions are invoked
after checking that pmState is PM_RUN, and in that state we do not
expect any other process to write this flag.
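
For illustration, a minimal standalone sketch of the lock-free versions
might look like the following. The struct here is a hypothetical stand-in
for the real shared-memory SlotSyncCtx and models only the flag; it is an
assumption for the sake of the example, not the code in the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the shared-memory control struct. Only the
 * one-byte flag discussed above is modeled; the real struct has more fields.
 */
typedef struct SlotSyncCtxStruct
{
	bool		stopSignaled;
} SlotSyncCtxStruct;

static SlotSyncCtxStruct SlotSyncCtxData = {false};
static SlotSyncCtxStruct *SlotSyncCtx = &SlotSyncCtxData;

/*
 * Is stopSignaled set in SlotSyncCtx?
 *
 * No spinlock is taken: this is a single-byte read, and the caller is
 * assumed to run in a state where no other process writes the flag.
 */
bool
IsStopSignaledSet(void)
{
	assert(SlotSyncCtx != NULL);
	return SlotSyncCtx->stopSignaled;
}

/*
 * Reset stopSignaled in SlotSyncCtx.
 *
 * Same assumption as above: no concurrent writer, so a plain store suffices.
 */
void
ResetStopSignaled(void)
{
	assert(SlotSyncCtx != NULL);
	SlotSyncCtx->stopSignaled = false;
}
```

The sketch is only safe under the assumption stated above, namely that
no other process can be writing the flag when these functions are called.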

> 2.
> +REPL_SLOTSYNC_MAIN "Waiting in main loop of slot sync worker."
> +REPL_SLOTSYNC_SHUTDOWN "Waiting for slot sync worker to shut down."
>
> Let's use REPLICATION instead of REPL. I see other wait events using
> REPLICATION in their names.

Modified.

> 3.
> - * In standalone mode and in autovacuum worker processes, we use a fixed
> - * ID, otherwise we figure it out from the authenticated user name.
> + * In standalone mode, autovacuum worker processes and slot sync worker
> + * process, we use a fixed ID, otherwise we figure it out from the
> + * authenticated user name.
> */
> - if (bootstrap || IsAutoVacuumWorkerProcess())
> + if (bootstrap || IsAutoVacuumWorkerProcess() || IsLogicalSlotSyncWorker())
> {
> InitializeSessionUserIdStandalone();
> am_superuser = true;
>
> IIRC, we discussed this previously and it is safe to make the local
> connection as superuser as we don't consult any user tables, so we can
> probably add a comment where we invoke InitPostgres in slotsync.c

Added a comment. Thanks to Hou-San for the analysis and for providing the comment.

> 4.
> $publisher->safe_psql('postgres',
> - "CREATE PUBLICATION regress_mypub FOR ALL TABLES;");
> + "CREATE PUBLICATION regress_mypub FOR ALL TABLES;"
> +);
>
> Why is this change required in the patch?

Not needed, removed it.

> 5.
> +# Confirm that restart_lsn and of confirmed_flush_lsn lsub1_slot slot are synced
> +# to the standby
>
> /and of/; looks like a typo

Modified.

> 6.
> +# Confirm that restart_lsn and of confirmed_flush_lsn lsub1_slot slot are synced
> +# to the standby
> +ok( $standby1->poll_query_until(
> + 'postgres',
> + "SELECT '$primary_restart_lsn' = restart_lsn AND '$primary_flush_lsn' = confirmed_flush_lsn from pg_replication_slots WHERE slot_name = 'lsub1_slot';"),
> + 'restart_lsn and confirmed_flush_lsn of slot lsub1_slot synced to standby');
> +
> ...
> ...
> +# Confirm the synced slot 'lsub1_slot' is retained on the new primary
> +is($standby1->safe_psql('postgres',
> + q{SELECT slot_name FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}),
> + 'lsub1_slot',
> + 'synced slot retained on the new primary');
>
> In both these checks, we should additionally check the 'synced' and
> 'temporary' flags to ensure that they are marked appropriately.

Modified.

Please find patch001 attached. There is a CFBot failure in patch002;
the test added there needs some adjustment. We will rebase and post
the rest of the patches once we fix that issue.

thanks
Shveta

Attachment: v94-0001-Add-a-new-slotsync-worker.patch (application/octet-stream, 60.0 KB)
