Re: Introduce wait_for_subscription_sync for TAP tests

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce wait_for_subscription_sync for TAP tests
Date: 2022-07-26 05:01:15
Message-ID: CAA4eK1Jg9KQXp065efTkW1w-m9hEexMUTVf-7AHwdusTWVAncw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 26, 2022 at 7:07 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Hi,
>
> In tap tests for logical replication, we have the following code in many places:
>
> $node_publisher->wait_for_catchup('tap_sub');
> my $synced_query =
> "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT
> IN ('r', 's');";
> $node_subscriber->poll_query_until('postgres', $synced_query)
> or die "Timed out while waiting for subscriber to synchronize data";
>
> Also, we sometime forgot to check either one, like we fixed in commit
> 1f50918a6fb02207d151e7cb4aae4c36de9d827c.
>
> I think we can have a new function to wait for all subscriptions to
> synchronize data. The attached patch introduce a new function
> wait_for_subscription_sync(). With this function, we can replace the
> above code with this one function as follows:
>
> $node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub');
>

+1. This reduces quite some code in various tests and will make it
easier to write future tests.

Few comments/questions:
====================
1.
-$node_publisher->wait_for_catchup('mysub1');
-
-# Also wait for initial table sync to finish.
-my $synced_query =
- "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT
IN ('r', 's');";
-$node_subscriber->poll_query_until('postgres', $synced_query)
- or die "Timed out while waiting for subscriber to synchronize data";
-
# Also wait for initial table sync to finish.
-$node_subscriber->poll_query_until('postgres', $synced_query)
- or die "Timed out while waiting for subscriber to synchronize data";
+$node_subscriber->wait_for_subscription_sync($node_publisher, 'mysub1');

It seems to me without your patch there is an extra poll in the above
test. If so, we can probably remove that in a separate patch?

2.
+ # wait for the replication to catchup if required.
+ if (defined($publisher))
+ {
+ croak 'subscription name must be specified' unless defined($subname);
+ $publisher->wait_for_catchup($subname, 'replay');
+ }
+
+ # then, wait for all table states to be ready.
+ print "Waiting for all subscriptions in \"$name\" to synchronize data\n";
+ my $query = qq[SELECT count(1) = 0
+ FROM pg_subscription_rel
+ WHERE srsubstate NOT IN ('r', 's');];
+ $self->poll_query_until($dbname, $query)
+ or croak "timed out waiting for subscriber to synchronize data";

In the tests, I noticed that a few places did wait_for_catchup after
the subscription check, and at other places, we did that check before
as you have it here. Ideally, I think wait_for_catchup should be after
confirming the initial sync is over as without initial sync, the
publisher node won't be completely in sync with the subscriber. What
do you think?

3. In the code quoted in the previous point, why did you pass the
second parameter as 'replay' when we have not used that in the tests
otherwise?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-07-26 05:04:02 Re: Allow file inclusion in pg_hba and pg_ident files
Previous Message Noah Misch 2022-07-26 04:53:47 Re: fairywren hung in pg_basebackup tests