Re: Synchronizing slots from primary to standby

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-12-27 06:06:13
Message-ID: CAD21AoD23jkCwvHkrGOyXRs5Ugh_SF0g0-aJpNTekCT0HHYcrg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thank you for working on this.

On Tue, Dec 26, 2023 at 9:27 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Dec 26, 2023 at 4:41 PM Zhijie Hou (Fujitsu)
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Wednesday, December 20, 2023 7:37 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Dec 20, 2023 at 3:29 PM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> > > wrote:
> > > >
> > > > On Wed, Dec 20, 2023 at 9:12 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > > wrote:
> > > > >
> > > > > On Tue, Dec 19, 2023 at 5:30 PM shveta malik <shveta(dot)malik(at)gmail(dot)com>
> > > wrote:
> > > > > >
> > > > > > Thanks for reviewing. I have addressed these in v50.
> > > > > >
> > > > >
> > > > > I was looking at this patch to see if something smaller could be
> > > > > independently committable. I think we can extract
> > > > > pg_get_slot_invalidation_cause() and commit it as that function
> > > > > could be independently useful as well. What do you think?
> > > > >
> > > >
> > > > Sure, forked another thread [1]
> > > > [1]:
> > > >
> > > https://www.postgresql.org/message-id/CAJpy0uBpr0ym12%2B0mXpjcRFA6
> > > N%3D
> > > > anX%2BYk9aGU4EJhHNu%3DfWykQ%40mail.gmail.com
> > > >
> > >
> > > Thanks, thinking more, we can split the patch into the following three patches
> > > which can be committed separately (a) Allowing the failover property to be set
> > > for a slot via SQL API and subscription commands
> > > (b) sync slot worker infrastructure (c) GUC standby_slot_names and the the
> > > corresponding wait logic in server-side.
> > >
> > > Thoughts?
> >
> > I agree. Here is the V54 patch set which was split based on the suggestion.
> > The commit message in each patch is also improved.
> >
>
> I would like to revisit the current dependency of slotsync worker on
> dbname used in 002 patch. Currently we accept dbname in
> primary_conninfo and thus the user has to make sure to provide one (by
> manually altering it) even in case of a conf file auto-generated by
> "pg_basebackup -R".
> Thus I would like to discuss if there are better ways to do it.
> Complete background is as follow:
>
> We need dbname for 2 purposes:
>
> 1) to connect to remote db in order to run SELECT queries to fetch the
> info needed by slotsync worker.
> 2) to make connection in slot-sync worker itself in order to be able
> to use libpq APIs for 1)
>
> We run 3 kind of select queries in slot-sync worker currently:
>
> a) To fetch all failover slots (logical slots) info at once in
> synchronize_slots().
> b) To fetch a particular slot info during
> wait_for_primary_slot_catchup() logic (logical slot).
> c) To validate primary slot (physical one) and also to distinguish
> between standby and cascading standby by running pg_is_in_recovery().
>
> 1) One approach to avoid dependency on dbname is using commands
> instead of SELECT. This will need implementing LIST_SLOTS command for
> a), and for b) we can use LIST_SLOTS and fetch everything (even though
> it is not needed) or have LIST_SLOTS with a filter on slot-name or
> extend READ_REPLICATION_SLOT, and for c) we can have some other
> command to get pg_is_in_recovery() info. But, I feel by relying on
> commands we will be making the extension of the slot-sync feature
> difficult. In future, if there is some more requirement to fetch any
> other info,
> then there too we have to implement a command. I am not sure if it is
> good and extensible approach.
>
> 2) Another way to avoid asking for a dbname in primary_conninfo is to
> use the default dbname internally. This brings us to two questions:
> 'How' and 'Which default db'?
>
> 2.1) To answer 'How':
> Using default dbname is simpler for the purpose of slot-sync worker
> having its own db-connection, but is a little tricky for the purpose
> of connection to remote_db. This is because we have to inject this
> dbname internally in our connection-info.
>
> 2.1.1) Say we use primary_conninfo (i.e. original one w/o dbname),
> then currently it could have 2 formats:
>
> a) The simple "=" format for key-value pairs, example:
> 'user=replication host=127.0.0.1 port=5433 dbname=postgres'.
> b) URI format, example:
> postgresql://other(at)localhost/otherdb?connect_timeout=10&application_name=myapp
>
> We can distinguish between the 2 formats using 'uri_prefix_length' but
> injecting the dbname part will be messy specially for URI format. If
> we want to do it w/o injecting and only by changing libpq interfaces
> to accept dbname separately apart from conninfo, then there is no
> current simpler way available. It will need a good amount of changes
> in libpq.
>
> 2.1.2) Another way is to not rely on primary_conninfo directly but
> rely on 'WalRcv->conninfo' in order to connect to remote_db. This is
> because the latter is never URI format, it is some parsed format and
> appending may work. As an example, primary_conninfo =
> 'postgresql://replication(at)localhost:5433', WalRcv->conninfo loaded
> internally is:
> "user=replication passfile=/home/shveta/.pgpass channel_binding=prefer
> dbname=replication host=localhost port=5433
> fallback_application_name=walreceiver sslmode=prefer sslcompression=0
> sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2
> gssencmode=disable krbsrvname=postgres gssdelegation=0
> target_session_attrs=any load_balance_hosts=disable", '\000'
>
> So we can try appending our default dbname to this. But all the
> defaults loaded in WalRcv->conninfo need some careful analysis to
> figure out if they work for slot-sync worker case.
>
> 2.2) Now coming to 'Which default db':
>
> 2.2.1) If we use 'template1' as default db, it may block 'create db'
> operations on primary for the time when the slot-sync worker is
> connected to remote using this dbname. Example:
>
> postgres=# create database newdb1;
> ERROR: source database "template1" is being accessed by other users
> DETAIL: There is 1 other session using the database.
>
> 2.2.2) If we use 'postgres' as default db, there are chances that it
> can be dropped as unlike 'template1', it is allowed to be dropped by
> user, and if slotsync worker is connected to it, user may see:
> newdb1=# drop database postgres;
> ERROR: database "postgres" is being accessed by other users
> DETAIL: There is 1 other session using the database.
>
> But once the slot-sync worker or standby goes down, user can always
> drop this and next time slot-sync worker may not be able to come up.
>

Other random ideas for discussion are:

3) The slotsync worker uses primary_conninfo but also uses a new GUC
parameter, say slot_sync_dbname, to specify the database to connect.
The slot_sync_dbname overwrites the dbname if primary_conninfo also
specifies it. If both don't have a dbname, raise an error.

4) The slotsync worker uses a new GUC parameter, say
slot_sync_conninfo, to specify the connection string to the primary
aside from primary_conninfo. And pg_basebackup -R generates
slot_sync_conninfo as well if required (new option required).

BTW given that the slotsync worker executes only normal SQL queries,
is there any reason why it uses a replication connection? It's
slightly odd to me that the pg_stat_replication view shows one entry
that remains in the "startup" state.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Japin Li 2023-12-27 06:31:31 Re: Tab complete for CREATE SUBSCRIPTION ... CONECTION does not work
Previous Message Tom Lane 2023-12-27 05:07:23 Re: POC: GROUP BY optimization