RE: Synchronizing slots from primary to standby

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'shveta malik' <shveta(dot)malik(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: RE: Synchronizing slots from primary to standby
Date: 2023-09-13 11:18:37
Message-ID: TYAPR01MB5866A5F20B784B38729B9EE6F5F0A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Shveta,

Sorry for the late response.

> Thanks Kuroda-san for the feedback.
> >
> > 01. General
> >
> > I think the documentation can be added, not only GUCs. How about adding
> > examples for combinations of physical and logical replication? You can say
> > that the physical primary can also be a publisher and that slots on
> > primary/standby are synchronized.
> >
>
> I did not fully understand this. Can you please state a clear example?
> We are only synchronizing logical replication slots in this draft, and
> that too on the physical standby from the primary. So the last statement
> is not completely true.

I was thinking of adding a new subsection to "Log-Shipping Standby Servers". I
think we can add information like the following (a rough configuration sketch
follows the list):

* A logical replication publisher can also be replicated.
* For that, a physical replication slot must be defined on the primary.
* Then we can set up standby_slot_names (on the primary) and synchronize_slot_names
  (on both servers).
* Slots are synchronized automatically.
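
To illustrate, here is a minimal configuration sketch such a subsection could
contain. The slot names, connection string, and logical slot name are only
placeholders; standby_slot_names and synchronize_slot_names are the GUCs from
this patch set, the rest are existing settings:

```
# on the primary: create a physical slot for the standby, e.g. via
#   SELECT pg_create_physical_replication_slot('sb1_slot');

# primary postgresql.conf
standby_slot_names     = 'sb1_slot'
synchronize_slot_names = 'logical_slot1'

# standby postgresql.conf
primary_conninfo       = 'host=primary user=repluser'
primary_slot_name      = 'sb1_slot'
synchronize_slot_names = 'logical_slot1'
```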

> > 02. General
> >
> > standby_slot_names ensures that the physical standby is always ahead of the
> > subscriber, but I think it may not be sufficient. There is a possibility
> > that the primary server does not have any physical slots. So it expects a
> > slot to be present. In this case the physical standby may be behind the
> > subscriber and the system may be confused when the failover occurs.
>
> Currently there is a check in the slot-sync worker which mandates that
> there is a physical slot present between primary and standby for this
> feature to proceed. So that confusion state will not arise.
> + /* WalRcvData is not set or primary_slot_name is not set yet */
> + if (!WalRcv || WalRcv->slotname[0] == '\0')
> + return naptime;

Right, but I wanted to know why it is needed. One motivation seemed to be knowing
the WAL location of the physical standby, but I thought that struct WalSnd.apply
could also be used. Is it bad to assume that a physical walsender always exists?

> > Can't we specify the name of the standby via application_name or something?
>
> So do you mean that in the absence of a physical slot (if we plan to
> support that), we let the primary know about the standby (slot-synchronization
> client) through application_name?

Yes, that is what I considered.
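
To make that alternative a bit more concrete, here is a minimal sketch (not the
patch's code) of how the primary could look at the walsenders' apply positions
directly instead of requiring primary_slot_name. Distinguishing physical from
logical walsenders and matching a particular standby (e.g. by application_name)
is deliberately left out here:

```
#include "postgres.h"

#include "access/xlogdefs.h"
#include "replication/walsender.h"			/* max_wal_senders */
#include "replication/walsender_private.h"	/* WalSndCtl, WalSnd */
#include "storage/spin.h"

/*
 * Hypothetical helper: return the minimum apply LSN among all currently
 * connected walsenders.  Returns InvalidXLogRecPtr when no walsender is
 * active, which is the "no physical walsender" case discussed above.
 */
static XLogRecPtr
min_walsender_apply_lsn(void)
{
	XLogRecPtr	result = InvalidXLogRecPtr;

	for (int i = 0; i < max_wal_senders; i++)
	{
		WalSnd	   *walsnd = &WalSndCtl->walsnds[i];
		XLogRecPtr	apply;
		pid_t		pid;

		SpinLockAcquire(&walsnd->mutex);
		pid = walsnd->pid;
		apply = walsnd->apply;
		SpinLockRelease(&walsnd->mutex);

		if (pid == 0)
			continue;			/* unused array slot */

		if (XLogRecPtrIsInvalid(result) || apply < result)
			result = apply;
	}

	return result;
}
```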

> >
> > 03. General
> >
> > In this architecture, the syncslot workers are launched per db and they
> > independently connect to the primary, right?
>
> Not completely true. Each slotsync worker is responsible for managing
> N dbs. Here 'N' = 'number of distinct dbs for slots in
> synchronize_slot_names' / 'number of max_slotsync_workers configured',
> for cases where the dbcount exceeds the number of workers configured.
> And if dbcount < max_slotsync_workers, then we launch only as many
> workers as the dbcount and each worker manages a single db. Each
> worker independently connects to the primary. Currently it makes a
> connection multiple times; I am optimizing it to connect only once and
> then reconnect after each SIGHUP, assuming 'primary_conninfo' may
> change. This change will be in the next version.
>
>
> > I'm not sure it is efficient, but I came up with another architecture - only
> > one worker (syncslot receiver) connects to the primary and the other workers
> > (syncslot workers) receive info from it and apply the updates. This can
> > reduce the number of connections so that it may slightly improve the network
> > latency. What do you think?
> >
>
> I feel it may help in reducing network latency, but I am not sure if it
> could be more efficient in keeping the LSNs in sync. I feel it may
> introduce lag due to the fact that only one worker is getting all the
> info from the primary and the actual synchronizing workers are waiting on
> that worker. This lag may be larger when the number of slots is huge.
> We have run some performance tests on the design implemented
> currently; please have a look at the emails around [1] and [2].

Thank you for the explanation! Yeah, I agree that another point might become a
bottleneck. It could be revisited in the future, but we do not have to consider
it for now...
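
As a side note, a worked example of the distribution you describe: with 10
distinct databases in synchronize_slot_names and max_slotsync_workers = 4, each
worker would manage 2 or 3 databases. A toy sketch of such an assignment (purely
illustrative; the patch's actual logic may distribute the databases differently):

```
#include "postgres.h"

/*
 * Toy sketch of a db-to-worker distribution as described above; purely
 * illustrative, not the patch's code.
 */
static void
assign_dbs_to_workers(Oid *dbids, int dbcount, int max_workers)
{
	int			nworkers = Min(dbcount, max_workers);

	for (int i = 0; i < dbcount; i++)
	{
		int			worker = i % nworkers;	/* round-robin */

		elog(DEBUG1, "database %u assigned to slotsync worker %d",
			 dbids[i], worker);
	}
}
```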

> > 04. ReplSlotSyncMain
> >
> > Does the worker have to connect to a specific database?
> >
> >
> > ```
> > /* Connect to our database. */
> > BackgroundWorkerInitializeConnectionByOid(MySlotSyncWorker->dbid,
> >                                           MySlotSyncWorker->userid,
> >                                           0);
> > ```
>
> Since we are using the libpq public interface 'walrcv_exec=libpqrcv_exec'
> to connect to the primary, this needs a database connection. It errors out
> in the absence of 'MyDatabaseId'. Do you think a db-connection can have
> some downsides?
>

My concern was that we should not grant more privileges to access data than
necessary. It might be better if we could avoid connecting to a specific
database, but I'm not sure whether we should add a new walreceiver API to handle
it. FYI, I checked the physical walreceiver for reference, but it is not a
background worker, so it did not help.
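
In case it helps: a background worker can in principle be started without
binding to a specific database (minimal sketch below, using the existing
bgworker API), but since walrcv_exec()/libpqrcv_exec() errors out without
MyDatabaseId, avoiding the per-db connection would indeed need some change on
the walreceiver side.

```
#include "postgres.h"

#include "postmaster/bgworker.h"

/*
 * Sketch only: passing NULL as the database name connects the worker to no
 * particular database, giving access to shared catalogs only.  As noted
 * above, this is not enough for the slotsync worker as written, because
 * libpqrcv_exec() requires MyDatabaseId to be set.
 */
void
slotsync_worker_main_sketch(Datum main_arg)
{
	BackgroundWorkerUnblockSignals();
	BackgroundWorkerInitializeConnection(NULL, NULL, 0);

	/* ... shared-catalog-only work would go here ... */
}
```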

And the following are further comments.

1.
I considered the combination of this feature and initial data sync, and found an
issue. What do you think? Assume that the name of the subscription is specified
in "synchronize_slot_names".

The synchronization of each table is separated into two transactions:

1. In the first transaction, a logical replication slot (pg_XXX_sync_XXX...) is
created and tuples are COPYed.
2. In the second transaction, changes made since the first transaction are
streamed and applied.

If the primary crashes between 1 and 2 and the standby is promoted, the tablesync
worker would execute "START_REPLICATION SLOT pg_XXX_sync_XXX..." against the
promoted server, but would fail because such a slot does not exist there.

Is this a problem we should solve? The above can be reproduced by adding a sleep().
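
To spell out the failure mode (the slot name is kept as the placeholder above,
the publication name is a placeholder too, and the actual option list depends on
the subscription), the promoted standby would reject the tablesync worker's
command roughly like this:

```
START_REPLICATION SLOT "pg_XXX_sync_XXX" LOGICAL 0/0
    (proto_version '4', publication_names '"mypub"');
-- ERROR:  replication slot "pg_XXX_sync_XXX" does not exist
```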

2.
Do we have to add some rules to the "Configuration Settings" section?

3.
You can run pgindent at your convenience.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
