Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-09-19 04:59:26
Message-ID: CAJpy0uDAVi9sWeFAcqupXc=51UBTVZd-e_DjC6eyM=frTsM1jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 15, 2023 at 2:22 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi. Here are some review comments for v17-0002.
>

Thanks Peter for the feedback. I have addressed most of these in v18
except 2. Please find my comments for the ones not addressed.

> This is a WIP and a long way from complete, but I wanted to send what
> I have so far (while it is still current with your latest posted
> patches).
>
> ======

>
> 34. ListSlotDatabaseOIDs - sorting/logic
>
> Maybe explain better the reason for having the qsort and other logic.
>
> TBH, I was not sure of the necessity for the names lists and the
> sorting and bsearch logic. AFAICT these are all *only* used to check
> for uniqueness and existence of the slot name. So I was wondering if a
> hashmap keyed by the slot name might be more appropriate and also
> simpler than all this list sorting/searching.
>

Pending. I will revisit this soon and will let you know more on this.
IMO, it was done to optimize the search as slot_names list could be
pretty huge if max_replication_slots is set to high value.

> ~~
>
> 35. ListSlotDatabaseOIDs
>
> + for (int slotno = 0; slotno < max_replication_slots; slotno++)
> + {
>
> loop variable declaration
>
>
> ======
> src/backend/storage/lmgr/lwlock.c
> OK
>
> ======
> src/backend/storage/lmgr/lwlocknames.txt
> OK
>
> ======
> .../utils/activity/wait_event_names.txt
> TODO
>
> ======
> src/backend/utils/misc/guc_tables.c
> OK
>
> ======
> src/backend/utils/misc/postgresql.conf.sample
>
> 36.
> # primary to streaming replication standby server
> +#max_slotsync_workers = 2 # max number of slot synchronization
> workers on a standby
>
> IMO it is better to say "maximum" instead of "max" in the comment.
>
> (make sure the GUC description text is identical)
>
> ======
> src/include/catalog/pg_proc.dat
>
> 37.
> +{ oid => '6312', descr => 'get invalidate cause of a replication slot',
> + proname => 'pg_get_invalidation_cause', provolatile => 's',
> proisstrict => 't',
> + prorettype => 'int2', proargtypes => 'name',
> + prosrc => 'pg_get_invalidation_cause' },
>
> 37a.
> SUGGESTION (descr)
> what caused the replication slot to become invalid
>
> ~
>
> 37b
> 'pg_get_invalidation_cause' seemed like a poor function name because
> it doesn't have any context -- not even the word "slot" in it.
>
> ======
> src/include/commands/subscriptioncmds.h
> OK
>
> ======
> src/include/nodes/replnodes.h
> OK
>
> ======
> src/include/postmaster/bgworker_internals.h
>
> 38.
> #define MAX_PARALLEL_WORKER_LIMIT 1024
> +#define MAX_SLOT_SYNC_WORKER_LIMIT 50
>
> Consider SLOTSYNC instead of SLOT_SYNC for consistency with other
> names of this worker.
>
> ======
> OK
>
> ======
> src/include/replication/logicalworker.h
>
> 39.
> extern void ApplyWorkerMain(Datum main_arg);
> extern void ParallelApplyWorkerMain(Datum main_arg);
> extern void TablesyncWorkerMain(Datum main_arg);
> +extern void ReplSlotSyncMain(Datum main_arg);
>
> The name is not consistent with others nearby. At least it should
> include the word "Worker" like everything else does.
>
> ======
> src/include/replication/slot.h
> OK
>
> ======
> src/include/replication/walreceiver.h
>
> 40.
> +/*
> + * Slot's DBid related data
> + */
> +typedef struct WalRcvRepSlotDbData
> +{
> + Oid database; /* Slot's DBid received from remote */
> + TimestampTz last_sync_time; /* The last time we tried to launch sync
> + * worker for above Dbid */
> +} WalRcvRepSlotDbData;
> +
>
>
> Is that comment about field 'last_sync_time' correct? I thought this
> field is the last time the slot was synced -- not the last time the
> worker was launched.

Sorry for confusion. Comment is correct, the name is misleading. I
have changed the name in v18.

>
> ======
> src/include/replication/worker_internal.h
>
> 41.
> - /* User to use for connection (will be same as owner of subscription). */
> + /* User to use for connection (will be same as owner of subscription
> + * in case of LogicalRep worker). */
> Oid userid;
> +} WorkerHeader;
>
> 41a.
>
> This is not the normal style for a multi-line comment.
>
> ~
>
> 41b.
> I wondered if the name "WorkerHeader" is just a bit *too* generic and
> might cause future trouble because of the vague name.
>

I agree. Can you please suggest a better name for it?

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-09-19 05:43:37 Re: Move global variables of pgoutput to plugin private scope.
Previous Message shveta malik 2023-09-19 04:50:55 Re: Synchronizing slots from primary to standby