Re: speed up a logical replica setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: speed up a logical replica setup
Date: 2024-01-09 05:19:28
Message-ID: CAA4eK1Kh5dNPXtX0RqMt07TtEkJsqW65gR2Lyt7yPBAo9Ausiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 8, 2024 at 12:35 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> > On Fri, Jan 5, 2024 at 3:36 PM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > >
> > > I love your proposal, so I want to join the review. Here are my first comments.
> > >
> > > 01.
> > > Should we restrict that `--subscriber-conninfo` must not have hostname or IP?
> > > We want users to execute pg_subscriber on the target, right?
> > >
> >
> > I don't see any harm in users giving those information but we should
> > have some checks to ensure that the server is in standby mode and is
> > running locally. The other related point is do we need to take input
> > for the target cluster directory from the user? Can't we fetch that
> > information once we are connected to standby?
>
> I think that functions like inet_client_addr() may be able to use, but it returns
> NULL only when the connection is via a Unix-domain socket. Can we restrict
> pg_subscriber to use such a socket?
>

Good question. So, IIUC, this tool has a requirement to run locally
where standby is present because we want to write reconvery.conf file.
I am not sure if it is a good idea to have a restriction to use only
the unix domain socket as users need to set up the standby for that by
configuring unix_socket_directories. It is fine if we can't ensure
that it is running locally but we should at least ensure that the
server is a physical standby node to avoid the problems as Shlok has
reported.

On a related point, I see that the patch stops the standby server (if
it is running) before starting with subscriber-side steps. I was
wondering if users can object to it that there was some important data
replication in progress which this tool has stopped. Now, OTOH,
anyway, once the user uses pg_subscriber, the standby server will be
converted to a subscriber, so it may not be useful as a physical
replica. Do you or others have any thoughts on this matter?

> > >
> > > 05.
> > > I found that the connection string for each subscriptions have a setting
> > > "fallback_application_name=pg_subscriber". Can we remove it?
> > >
> > > ```
> > > postgres=# SELECT subconninfo FROM pg_subscription;
> > > subconninfo
> > >
> > ---------------------------------------------------------------------------------
> > > user=postgres port=5431 fallback_application_name=pg_subscriber
> > dbname=postgres
> > > (1 row)
> > > ```
> >
> > Can that help distinguish the pg_subscriber connection on the publisher?
> >
>
> Note that this connection string is used between the publisher instance and the
> subscriber instance (not pg_subscriber client application). Also, the
> fallback_application_name would be replaced to the name of subscriber in
> run_apply_worker()->walrcv_connect(). Actually the value would not be used.
>

Fair point. It is not clear what other purpose this can achieve,
probably Euler has something in mind for this.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-01-09 05:24:19 Re: Commitfest 2024-01 first week update
Previous Message Michael Paquier 2024-01-09 04:59:08 Re: Add a perl function in Cluster.pm to generate WAL