Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-09-21 03:46:29
Message-ID: CAJpy0uAsuzvg7JVc48UBUNa9ayqoa24PU7kM6U2_MfrF8p-prQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 19, 2023 at 10:29 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Fri, Sep 15, 2023 at 2:22 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi. Here are some review comments for v17-0002.
> >
>
> Thanks Peter for the feedback. I have addressed most of these in v18
> except 2. Please find my comments for the ones not addressed.
>
> > This is a WIP and a long way from complete, but I wanted to send what
> > I have so far (while it is still current with your latest posted
> > patches).
> >
> > ======
>
> >
> > 34. ListSlotDatabaseOIDs - sorting/logic
> >
> > Maybe explain better the reason for having the qsort and other logic.
> >
> > TBH, I was not sure of the necessity for the names lists and the
> > sorting and bsearch logic. AFAICT these are all *only* used to check
> > for uniqueness and existence of the slot name. So I was wondering if a
> > hashmap keyed by the slot name might be more appropriate and also
> > simpler than all this list sorting/searching.
> >
>
> Pending. I will revisit this soon and will let you know more on this.
> IMO, it was done to optimize the search as slot_names list could be
> pretty huge if max_replication_slots is set to high value.
>
> > ~~
> >
> > 35. ListSlotDatabaseOIDs
> >
> > + for (int slotno = 0; slotno < max_replication_slots; slotno++)
> > + {
> >
> > loop variable declaration
> >
> >
> > ======
> > src/backend/storage/lmgr/lwlock.c
> > OK
> >
> > ======
> > src/backend/storage/lmgr/lwlocknames.txt
> > OK
> >
> > ======
> > .../utils/activity/wait_event_names.txt
> > TODO
> >
> > ======
> > src/backend/utils/misc/guc_tables.c
> > OK
> >
> > ======
> > src/backend/utils/misc/postgresql.conf.sample
> >
> > 36.
> > # primary to streaming replication standby server
> > +#max_slotsync_workers = 2 # max number of slot synchronization
> > workers on a standby
> >
> > IMO it is better to say "maximum" instead of "max" in the comment.
> >
> > (make sure the GUC description text is identical)
> >
> > ======
> > src/include/catalog/pg_proc.dat
> >
> > 37.
> > +{ oid => '6312', descr => 'get invalidate cause of a replication slot',
> > + proname => 'pg_get_invalidation_cause', provolatile => 's',
> > proisstrict => 't',
> > + prorettype => 'int2', proargtypes => 'name',
> > + prosrc => 'pg_get_invalidation_cause' },
> >
> > 37a.
> > SUGGESTION (descr)
> > what caused the replication slot to become invalid
> >
> > ~
> >
> > 37b
> > 'pg_get_invalidation_cause' seemed like a poor function name because
> > it doesn't have any context -- not even the word "slot" in it.
> >
> > ======
> > src/include/commands/subscriptioncmds.h
> > OK
> >
> > ======
> > src/include/nodes/replnodes.h
> > OK
> >
> > ======
> > src/include/postmaster/bgworker_internals.h
> >
> > 38.
> > #define MAX_PARALLEL_WORKER_LIMIT 1024
> > +#define MAX_SLOT_SYNC_WORKER_LIMIT 50
> >
> > Consider SLOTSYNC instead of SLOT_SYNC for consistency with other
> > names of this worker.
> >
> > ======
> > OK
> >
> > ======
> > src/include/replication/logicalworker.h
> >
> > 39.
> > extern void ApplyWorkerMain(Datum main_arg);
> > extern void ParallelApplyWorkerMain(Datum main_arg);
> > extern void TablesyncWorkerMain(Datum main_arg);
> > +extern void ReplSlotSyncMain(Datum main_arg);
> >
> > The name is not consistent with others nearby. At least it should
> > include the word "Worker" like everything else does.
> >
> > ======
> > src/include/replication/slot.h
> > OK
> >
> > ======
> > src/include/replication/walreceiver.h
> >
> > 40.
> > +/*
> > + * Slot's DBid related data
> > + */
> > +typedef struct WalRcvRepSlotDbData
> > +{
> > + Oid database; /* Slot's DBid received from remote */
> > + TimestampTz last_sync_time; /* The last time we tried to launch sync
> > + * worker for above Dbid */
> > +} WalRcvRepSlotDbData;
> > +
> >
> >
> > Is that comment about field 'last_sync_time' correct? I thought this
> > field is the last time the slot was synced -- not the last time the
> > worker was launched.
>
> Sorry for confusion. Comment is correct, the name is misleading. I
> have changed the name in v18.
>
> >
> > ======
> > src/include/replication/worker_internal.h
> >
> > 41.
> > - /* User to use for connection (will be same as owner of subscription). */
> > + /* User to use for connection (will be same as owner of subscription
> > + * in case of LogicalRep worker). */
> > Oid userid;
> > +} WorkerHeader;
> >
> > 41a.
> >
> > This is not the normal style for a multi-line comment.
> >
> > ~
> >
> > 41b.
> > I wondered if the name "WorkerHeader" is just a bit *too* generic and
> > might cause future trouble because of the vague name.
> >
>
> I agree. Can you please suggest a better name for it?
>
>
> thanks
> Shveta

Currently in patch001, synchronize_slot_names is a GUC on both primary
and physical standby. This GUC tells which all logical slots need to
be synced on physical standbys from the primary. Ideally it should be
a GUC on physical standby alone and each physical standby should be
able to communicate the value to the primary (considering the value
may vary for different physical replicas of the same primary). The
primary on the other hand should be able to take UNION of these values
and let the logical walsenders (belonging to the slots in UNION
synchronize_slots_names) wait for physical standbys for confirmation
before sending those changes to logical subscribers. The intent is
logical subscribers should never be ahead of physical standbys.

So in order to implement this i.e. each physical standby communicating
synchronize_slot_names individually to primary, we need to maintain
the resultant/union value in shared-memory on primary so that each of
the logical walsenders can read these values. For the sake of less
complexity around this involved shared-memory, it will be good to have
synchronize_slot_names as PGC_POSTMASTER GUC parameter on physical
standby rather than a PGC_SIGHUP one. Making it PGC_POSTMASTER on
physical standby will make primary aware of the fact that slot-sync
connection from physical standby is going down and thus we now need to
invalidate the UNION synchronize_slot_names and compute it fresh from
rest of the connections from physical-standby's which are still alive.
Also when any new connection for slot-sync purpose comes from physical
standby, we need to compute the UNION synchronize_slot_names list
again. The synchronize_slots_names invalidation mechanism on primary
will become connection based.

Any thoughts? Does PGC_POSTMASTER over PGC_SIGHUP seem a reasonable choice here?

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-09-21 04:18:55 Re: Bug fix in vacuumdb --buffer-usage-limit xxx -Z
Previous Message Bowen Shi 2023-09-21 03:45:06 Re: Requiring recovery.signal or standby.signal when recovering with a backup_label