Re: Synchronizing slots from primary to standby

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-09-18 10:22:55
Message-ID: CAJpy0uD7SgVZ1AO9JZ3LVk-NA=FXfLHP2SZO3ijL__5n1honpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 13, 2023 at 4:48 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Shveta,
>
> Sorry for the late response.
>
> > Thanks Kuroda-san for the feedback.
> > >
> > > 01. General
> > >
> > > I think the documentation can be added, not only GUCs. How about adding
> > examples
> > > for combinations of physical and logical replications? You can say that both of
> > > physical primary can be publisher and slots on primary/standby are
> > synchronized.
> > >
> >
> > I did not fully understand this. Can you please state a clear example.
> > We are only synchronizing logical replication slots in this draft and
> > that too on physical standby from primary. So the last statement is
> > not completely true.
>
> I expected to add a new subsection in "Log-Shipping Standby Servers". I think we
> can add like following infos:
>
> * logical replication publisher can be also replicated
> * For that, a physical repliation slot must be defined on primar
> * Then we can set up standby_slot_names(on primary) and synchronize_slot_names
> (on both server).
> * slots are synchronized automatically

Sure. I am trying to find the right place in this section to add this
info. I will try to address this in coming versions.

>
> > > 02. General
> > >
> > > standby_slot_names ensures that physical standby is always ahead subscriber,
> > but I
> > > think it may be not sufficient. There is a possibility that primary server does
> > > not have any physical slots.So it expects a slot to be present.
> > > In this case the physical standby may be behind the
> > > subscriber and the system may be confused when the failover is occured.
> >
> > Currently there is a check in slot-sync worker which mandates that
> > there is a physical slot present between primary and standby for this
> > feature to proceed.So that confusion state will not arise.
> > + /* WalRcvData is not set or primary_slot_name is not set yet */
> > + if (!WalRcv || WalRcv->slotname[0] == '\0')
> > + return naptime;
>
> Right, but I wanted to know why it is needed. One motivation seemed to know the
> WAL location of physical standby, but I thought that struct WalSnd.apply could
> be also used. Is it bad to assume that the physical walsender always exists?
>

We do not plan to target this case where physical slot is not created
between primary and physical-standby in the first draft. In such a
case, slot-synchronization will be skipped for the time being. We can
extend this functionality (if needed) later.

> > >Can't we specify the name of standby via application_name or something?
> >
> > So do you mean that in absence of a physical slot (if we plan to
> > support that), we let primary know about standby(slots-synchronization
> > client) through application_name?
>
> Yes, it is what I considered.
>
> > >
> > > 03. General
> > >
> > > In this architecture, the syncslot worker is launched per db and they
> > > independently connects to primary, right?
> >
> > Not completely true. Each slotsync worker is responsible for managing
> > N dbs. Here 'N' = 'Number of distinct dbs for slots in
> > synchronize_slot_names'/ 'number of max_slotsync_workers configured'
> > for cases where dbcount exceeds workers configured.
> > And if dbcount < max_slotsync_workers, then we launch only that many
> > workers equal to dbcount and each worker manages a single db. Each
> > worker independently connects to primary. Currently it makes a
> > connection multiple times, I am optimizing it to make connection only
> > once and then after each SIGHUP assuming 'primary_conninfo' may
> > change. This change will be in the next version.
> >
> >
> > >I'm not sure it is efficient, but I
> > > come up with another architecture - only a worker (syncslot receiver)connects
> > > to the primary and other workers (syncslot worker) receives infos from it and
> > > updates. This can reduce the number of connections so that it may slightly
> > > improve the latency of network. How do you think?
> > >
> >
> > I feel it may help in reducing network latency, but not sure if it
> > could be more efficient in keeping the lsns in sync. I feel it may
> > introduce lag due to the fact that only one worker is getting all the
> > info from primary and the actual synchronizing workers are waiting on
> > that worker. This lag may be more when the number of slots are huge.
> > We have run some performance tests on the design implemented
> > currently, please have a look at emails around [1] and [2].
>
> Thank you for teaching! Yeah, I agreed that another point might be a bottleneck.
> It could be recalled in future, but currently we do not have to consider...
>
> > > 04. ReplSlotSyncMain
> > >
> > > Does the worker have to connect to the specific database?
> > >
> > >
> > > ```
> > > /* Connect to our database. */
> > >
> > BackgroundWorkerInitializeConnectionByOid(MySlotSyncWorker->dbid,
> > >
> > MySlotSyncWorker->userid,
> > >
> > 0);
> > > ```
> >
> > Since we are using libpq public interface 'walrcv_exec=libpqrcv_exec'
> > to connect to primary, this needs database connection. It errors out
> > in the absence of 'MyDatabaseId'. Do you think db-connection can have
> > some downsides?
> >
>
> I considered that we should not grant privileges to access data more than necessary.
> It might be better if we can avoid to connect to the specific database. But I'm
> not sure that we should have to add new walreceiver API to handle it. FYI, I
> checked the physical walreceiver to refer it, but it was not background worker
> so that it was no meaning.
>

If this needs to be done, we need to have new walreceiver APIs around
that (no db connection) which are currently exposed to front-end
through libpq-fe.h but not exposed to the backend. I am not sure about
the feasibility for that and the effort needed there. So currently we
plan to go by db-connection as allowed by current libpq-walreceiver
APIs.

>
> And followings are further comments.
>
> 1.
> I considered the combination with the feature and initial data sync, and found an
> issue. How do you think? Assuming that the name of subscription is specified as
> "synchronize_slot_names".
>
> A synchronization of each tables is separated into two transactions:
>
> 1. In a first transaction, a logical replication slot (pg_XXX_sync_XXX...)is
> created and tuples are COPYd.
> 2. In a second transaction, changes from the first transaction are streamed by
> and applied.
>
> If the primary crashed between 1 and 2 and standby is promoted, the tablesync
> worker would execute "START_REPLICATION SLOT pg_XXX_sync_XXX..." to promoted
> server, but fail because such a slot does not exist.
>
> Is this a problem we should solve? Above can be reproduced by adding sleep().

I will try to reproduce this scenario to understand it better. Allow
me some more time.

>
> 2.
> Do we have to add some rules in "Configuration Settings" section?

Sure. I will review this further and see if we can add anything there.

>
> 3.
> You can run pgindent in your timing.
>

I have done it in v17.

>
> Best Regards,
> Hayato Kuroda
> FUJITSU LIMITED

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-09-18 10:49:24 Re: Sync scan & regression tests
Previous Message Amit Langote 2023-09-18 10:20:00 Re: remaining sql/json patches