Re: Synchronizing slots from primary to standby

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2023-11-22 07:41:43
Message-ID: CAHut+PuEGX5kr0xh06yv8ndoAQvDNedoec1OqOq3GMxDN6p=9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In addition to my recent v35-0001 comment not yet addressed [1], here
are some review comments for patch v37-0001.

======
src/backend/replication/walsender.c

1. PhysicalWakeupLogicalWalSnd
+/*
+ * Wake up logical walsenders with failover-enabled slots if the physical slot
+ * of the current walsender is specified in standby_slot_names GUC.
+ */
+void
+PhysicalWakeupLogicalWalSnd(void)
+{
+ ListCell *lc;
+ List *standby_slots;
+ bool slot_in_list = false;
+
+ Assert(MyReplicationSlot != NULL);
+ Assert(SlotIsPhysical(MyReplicationSlot));
+
+ standby_slots = GetStandbySlotList(false);
+
+ foreach(lc, standby_slots)
+ {
+ char *name = lfirst(lc);
+
+ if (strcmp(name, NameStr(MyReplicationSlot->data.name)) == 0)
+ {
+ slot_in_list = true;
+ break;
+ }
+ }
+
+ if (slot_in_list)
+ ConditionVariableBroadcast(&WalSndCtl->wal_confirm_rcv_cv);
+}

1a.
Easier to have single assertion -- Assert(MyReplicationSlot &&
SlotIsPhysical(MyReplicationSlot));

~

1b.
Why bother with the 'slot_in_list' and break, when you can just call
the ConditionVariableBroadcast() and return without having the extra
variable?

======
src/test/recovery/t/050_verify_slot_order.pl

~~~

2.
Should you name the global objects with a 'regress_' prefix which
seems to be the standard for other new TAP tests?

~~~

3.
+#
+# | ----> standby1 (connected via streaming replication)
+# | ----> standby2 (connected via streaming replication)
+# primary ----- |
+# | ----> subscriber1 (connected via logical replication)
+# | ----> subscriber2 (connected via logical replication)
+#
+#
+# Set up is configured in such a way that primary never lets subscriber1 ahead
+# of standby1.

3a.
Misaligned "|" in comment?

~

3b.
IMO it would be better to give an overview of how this all works
instead of just saying "configured in such a way".

~~~

4.
+# Configure primary to disallow specified logical replication slot (lsub1_slot)
+# getting ahead of specified physical replication slot (sb1_slot).
+$primary->append_conf(

It is confusing because there is no "lsub1_slot" specified anywhere
until much later. Would you be able to provide some more details?

~~~

5.
+# Create another subscriber node, wait for sync to complete
+my $subscriber2 = PostgreSQL::Test::Cluster->new('subscriber2');
+$subscriber2->init(allows_streaming => 'logical');
+$subscriber2->start;
+$subscriber2->safe_psql('postgres', "CREATE TABLE tab_int (a int
PRIMARY KEY);");
+$subscriber2->safe_psql('postgres',
+ "CREATE SUBSCRIPTION mysub2 CONNECTION '$publisher_connstr' "
+ . "PUBLICATION mypub WITH (slot_name = lsub2_slot);");
+$subscriber2->wait_for_subscription_sync;

Maybe this comment should explicitly say there is no failover enabled
here. Maybe the SUBSCRIPTION should explicitly set failover=false?

~~~

6.
+# The subscription that's up and running and is enabled for failover
+# doesn't get the data from primary and keeps waiting for the
+# standby specified in standby_slot_names.
+$result = $subscriber1->safe_psql('postgres',
+ "SELECT count(*) = 0 FROM tab_int;");
+is($result, 't', "subscriber1 doesn't get data from primary until
standby1 acknowledges changes");

Might it be better to write as "SELECT count(*) = $primary_row_count
FROM tab_int;" and expect it to return false?

======
src/test/regress/expected/subscription.out

7.
Everything here displays the "Failover" state 'd' (disabled). How
about tests for different state values?

======
[1] https://www.postgresql.org/message-id/CAHut%2BPv-yu71ogj_hRi6cCtmD55bsyw7XTxj1Nq8yVFKpY3NDQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-11-22 07:59:59 Re: [PoC] pg_upgrade: allow to upgrade publisher node
Previous Message Andres Freund 2023-11-22 07:37:30 Re: remaining sql/json patches