Re: speed up a logical replica setup

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: speed up a logical replica setup
Date: 2024-01-09 09:25:37
Message-ID: CAA4eK1Kf+JKG4Je2O_TpaT3rKazmDGAr5uVAqAgrwKsp8vYWFQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 9, 2024 at 12:31 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> > > > I don't see any harm in users giving those information but we should
> > > > have some checks to ensure that the server is in standby mode and is
> > > > running locally. The other related point is do we need to take input
> > > > for the target cluster directory from the user? Can't we fetch that
> > > > information once we are connected to standby?
> > >
> > > I think that functions like inet_client_addr() could be used, since it
> > > returns NULL only when the connection is via a Unix-domain socket. Can we
> > > restrict pg_subscriber to use such a socket?
> > >
> >
> > Good question. So, IIUC, this tool has a requirement to run locally
> > where standby is present because we want to write recovery.conf file.
> > I am not sure if it is a good idea to have a restriction to use only
> > the unix domain socket as users need to set up the standby for that by
> > configuring unix_socket_directories. It is fine if we can't ensure
> > that it is running locally but we should at least ensure that the
> > server is a physical standby node to avoid the problems as Shlok has
> > reported.
>
> While thinking more about it, I found that we have not defined a policy on
> whether users may connect to the target while pg_subscriber is running. What
> should it be? If connections should be avoided, some parameters like
> listen_addresses and unix_socket_permissions should be restricted, as
> start_postmaster() in pg_upgrade/server.c does.
>

Yeah, this makes sense to me.

> Also, the port number should be changed to another value
> as well.
>

Fair point, but I think in that case we should take this as one of the
parameters.
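
To make the idea concrete, here is a minimal sketch of how the server options
could be assembled, loosely modeled on the settings mentioned above
(listen_addresses and unix_socket_permissions) plus a user-supplied port. The
function name build_restricted_opts() and its parameters are hypothetical and
do not come from the patch; the socket directory argument is likewise an
assumption.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build an option string for the postmaster that disables TCP connections,
 * limits the Unix-domain socket to its owner, and uses a caller-chosen port.
 * Hypothetical helper for illustration only.
 *
 * Returns 0 on success, -1 if the port is out of range or the buffer is
 * too small to hold the result.
 */
int
build_restricted_opts(char *buf, size_t buflen, int port, const char *sockdir)
{
	int			n;

	if (port < 1 || port > 65535)
		return -1;

	n = snprintf(buf, buflen,
				 "-p %d -c listen_addresses='' "
				 "-c unix_socket_permissions=0700 "
				 "-c unix_socket_directories='%s'",
				 port, sockdir);

	/* snprintf reports the length it wanted; treat truncation as failure */
	return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

The resulting string could be handed to pg_ctl via its -o option. Taking the
port as an argument keeps the choice with the user rather than hard-coding it.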

> Personally, I vote to reject connections while pg_subscriber is running.
>
> > On a related point, I see that the patch stops the standby server (if
> > it is running) before starting with subscriber-side steps. I was
> > wondering if users can object to it that there was some important data
> > replication in progress which this tool has stopped. Now, OTOH,
> > anyway, once the user uses pg_subscriber, the standby server will be
> > converted to a subscriber, so it may not be useful as a physical
> > replica. Do you or others have any thoughts on this matter?
>
> I assumed that connections should be closed before running pg_subscriber. If so,
> it may be better to just fail the command when the physical standby has already
> been started. There is no single answer as to whether data replication and user
> queries should stop. Users should choose how to stop the server based on their
> policy, and then pg_subscriber can start the postmaster.
> pg_upgrade does the same thing in setup().
>

Agreed.
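
As a rough illustration of failing early, the tool could look for
postmaster.pid in the target data directory before doing anything else. This
is only a sketch under the assumption that the presence of that file is a good
enough signal; a stale file left by a crash would also trigger it, so a real
check would need more care. The name pid_file_present() is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Report whether <datadir>/postmaster.pid exists.
 * Hypothetical helper: a caller could refuse to proceed when this
 * returns true and ask the user to stop the standby first.
 */
bool
pid_file_present(const char *datadir)
{
	char		path[4096];
	struct stat st;
	int			n;

	n = snprintf(path, sizeof(path), "%s/postmaster.pid", datadir);
	if (n < 0 || (size_t) n >= sizeof(path))
		return false;			/* path too long; treated as "not present" */

	return stat(path, &st) == 0;
}
```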

> ====
>
> Further comment:
> According to the doc, pg_subscriber is currently listed under client applications.
> But based on the definition, I felt it should be on the "PostgreSQL Server
> Applications" page. What do you think?
>

I also think it should be a server application.

--
With Regards,
Amit Kapila.
