From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Subject: | Re: Synchronizing slots from primary to standby |
Date: | 2024-02-12 10:03:20 |
Message-ID: | Zcns6KGU1H/ipp2+@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Sun, Feb 11, 2024 at 01:23:19PM +0000, Zhijie Hou (Fujitsu) wrote:
> On Saturday, February 10, 2024 9:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Sat, Feb 10, 2024 at 5:31 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> > wrote:
> > >
> > > On Fri, Feb 9, 2024 at 4:08 PM Zhijie Hou (Fujitsu)
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > >
> > > > Another alternative is to register the callback when calling
> > > > slotsync functions and unregister it after the function call. And
> > > > register the callback in
> > > > slotsyncworkmain() for the slotsync worker patch, although this may
> > > > adds a few more codes.
> > >
> > > Another idea is that SyncReplicationSlots() calls synchronize_slots()
> > > in PG_ENSURE_ERROR_CLEANUP() block instead of PG_TRY(), to make sure
> > > to clear the flag in case of ERROR or FATAL. And the slotsync worker
> > > uses the before_shmem_callback to clear the flag.
> > >
> >
> > +1. This sounds like a better way to clear the flag.
>
> Agreed. Here is the V84 patch which addressed this.
>
> Apart from above, I removed the txn start/end codes from 0001 as they are used
> in the slotsync worker patch. And I also ran pgindent and pgperltidy for the
> patch.
>
Thanks!
A few random comments:
001 ===
"
For
the synchronization to work, it is mandatory to have a physical
replication slot between the primary and the standby,
"
Maybe mention "primary_slot_name" here?
002 ===
+ <para>
+ Synchronize the logical failover slots from the primary server to the standby server.
should we say "logical failover replication slots" instead?
003 ===
+ If, after executing the function,
+ <link linkend="guc-hot-standby-feedback">
+ <varname>hot_standby_feedback</varname></link> is disabled on
+ the standby or the physical slot configured in
+ <link linkend="guc-primary-slot-name">
+ <varname>primary_slot_name</varname></link> is
+ removed,
I think another option that could lead to slot invalidation is if primary_slot_name
is NULL or miss-configured. Indeed hot_standby_feedback would be working
(for the catalog_xmin) but only as long as the standby is up and running.
004 ===
+ on the standby. For the synchronization to work, it is mandatory to
+ have a physical replication slot between the primary and the standby,
should we mention primary_slot_name here?
005 ===
+ To resume logical replication after failover from the synced logical
+ slots, the subscription's 'conninfo' must be altered
Only in a pub/sub context but not for other ways of using the logical replication
slot(s).
006 ===
+ neither be used for logical decoding nor dropped by the user
what about "nor dropped manually"?
007 ===
+typedef struct SlotSyncCtxStruct
+{
Should we remove "Struct" from the struct name?
008 ===
+ ereport(LOG,
+ errmsg("dropped replication slot \"%s\" of dbid %d",
+ NameStr(local_slot->data.name),
+ local_slot->data.database));
We emit a message when an "invalidated" slot is dropped but not when we create
a slot. Shouldn't we emit a message when we create a synced slot on the standby?
I think that could be confusing to see "a drop" message not followed by "a create"
one when it's expected (slot valid on the primary for example).
009 ===
Regarding 040_standby_failover_slots_sync.pl what about adding tests for?
- synced slot invalidation (and ensure it's recreated once pg_sync_replication_slots()
is called and when the slot in primary is valid)
- cannot enable failover for a temporary replication slot
- replication slots can only be synchronized from a standby server
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2024-02-12 10:24:13 | Re: Catalog domain not-null constraints |
Previous Message | Heikki Linnakangas | 2024-02-12 09:58:24 | Re: Remove WIN32 conditional compilation from win32common.c |