Quick Links

[PATCH v2] Standardize log polling using wait_for_log() across recovery tests

From:	Zakariyah Ali <zakariyahali100(at)gmail(dot)com>
To:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org, Zakariyah Ali <zakariyahali100(at)gmail(dot)com>
Subject:	[PATCH v2] Standardize log polling using wait_for_log() across recovery tests
Date:	2026-06-20 17:49:06
Message-ID:	20260620174907.122631-1-zakariyahali100@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

This patch replaces open-coded log wait loops with the existing wait_for_log() helper in the recovery test suite, specifically updating 019_replslot_limit.pl, 033_replay_tsp_drops.pl, and 041_checkpoint_at_promote.pl. This removes duplicated polling logic and ensures a consistent approach to waiting for specific log lines during tests.

Signed-off-by: Zakariyah Ali <zakariyahali100(at)gmail(dot)com>
---
src/test/recovery/t/019_replslot_limit.pl | 78 ++++---------------
src/test/recovery/t/033_replay_tsp_drops.pl | 16 ++--
.../recovery/t/041_checkpoint_at_promote.pl | 13 +---
3 files changed, 23 insertions(+), 84 deletions(-)

diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index a412faf51c6..3fdce739965 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -186,18 +186,9 @@ $node_primary->advance_wal(7);
$node_primary->safe_psql('postgres',
'ALTER SYSTEM RESET max_wal_size; SELECT pg_reload_conf()');
$node_primary->safe_psql('postgres', "CHECKPOINT;");
-my $invalidated = 0;
-for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
-{
- if ($node_primary->log_contains(
- 'invalidating obsolete replication slot "rep1"', $logstart))
- {
- $invalidated = 1;
- last;
- }
- usleep(100_000);
-}
-ok($invalidated, 'check that slot invalidation has been logged');
+ok( $node_primary->wait_for_log(
+ 'invalidating obsolete replication slot "rep1"', $logstart),
+ 'check that slot invalidation has been logged');

$result = $node_primary->safe_psql(
'postgres',
@@ -208,17 +199,8 @@ is($result, "rep1|f|t|lost|",
'check that the slot became inactive and the state "lost" persists');

# Wait until current checkpoint ends
-my $checkpoint_ended = 0;
-for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
-{
- if ($node_primary->log_contains("checkpoint complete: ", $logstart))
- {
- $checkpoint_ended = 1;
- last;
- }
- usleep(100_000);
-}
-ok($checkpoint_ended, 'waited for checkpoint to end');
+ok( $node_primary->wait_for_log("checkpoint complete: ", $logstart),
+ 'waited for checkpoint to end');

# The invalidated slot shouldn't keep the old-segment horizon back;
# see bug #17103: https://postgr.es/m/17103-004130e8f27782c9@postgresql.org
@@ -238,18 +220,10 @@ is($oldestseg, $redoseg, "check that segments have been removed");
$logstart = -s $node_standby->logfile;
$node_standby->start;

-my $failed = 0;
-for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
-{
- if ($node_standby->log_contains(
- "This replication slot has been invalidated due to \"wal_removed\".",
- $logstart))
- {
- $failed = 1;
- last;
- }
- usleep(100_000);
-}
+my $failed =
+ $node_standby->wait_for_log(
+ "This replication slot has been invalidated due to \"wal_removed\".",
+ $logstart);
ok($failed, 'check that replication has been broken');

$node_primary->stop;
@@ -374,20 +348,10 @@ $logstart = -s $node_primary3->logfile;
kill 'STOP', $senderpid, $receiverpid;
$node_primary3->advance_wal(2);

-my $msg_logged = 0;
-my $max_attempts = $PostgreSQL::Test::Utils::timeout_default;
-while ($max_attempts-- >= 0)
-{
- if ($node_primary3->log_contains(
- "terminating process $senderpid to release replication slot \"rep3\"",
- $logstart))
- {
- $msg_logged = 1;
- last;
- }
- sleep 1;
-}
-ok($msg_logged, "walsender termination logged");
+ok( $node_primary3->wait_for_log(
+ "terminating process $senderpid to release replication slot \"rep3\"",
+ $logstart),
+ "walsender termination logged");

# Now let the walsender continue; slot should be killed now.
# (Must not let walreceiver run yet; otherwise the standby could start another
@@ -398,19 +362,9 @@ $node_primary3->poll_query_until('postgres',
"lost")
or die "timed out waiting for slot to be lost";

-$msg_logged = 0;
-$max_attempts = $PostgreSQL::Test::Utils::timeout_default;
-while ($max_attempts-- >= 0)
-{
- if ($node_primary3->log_contains(
- 'invalidating obsolete replication slot "rep3"', $logstart))
- {
- $msg_logged = 1;
- last;
- }
- sleep 1;
-}
-ok($msg_logged, "slot invalidation logged");
+ok( $node_primary3->wait_for_log(
+ 'invalidating obsolete replication slot "rep3"', $logstart),
+ "slot invalidation logged");

# Now let the walreceiver continue, so that the node can be stopped cleanly
kill 'CONT', $receiverpid;
diff --git a/src/test/recovery/t/033_replay_tsp_drops.pl b/src/test/recovery/t/033_replay_tsp_drops.pl
index 93c89550511..f9e5c990bcb 100644
--- a/src/test/recovery/t/033_replay_tsp_drops.pl
+++ b/src/test/recovery/t/033_replay_tsp_drops.pl
@@ -130,16 +130,10 @@ $node_primary->safe_psql(
# Standby should fail and should not silently skip replaying the wal
# In this test, PANIC turns into WARNING by allow_in_place_tablespaces.
# Check the log messages instead of confirming standby failure.
-my $max_attempts = $PostgreSQL::Test::Utils::timeout_default * 10;
-while ($max_attempts-- >= 0)
-{
- last
- if (
- $node_standby->log_contains(
- qr!WARNING: ( [A-Z0-9]+:)? creating missing directory: pg_tblspc/!,
- $logstart));
- usleep(100_000);
-}
-ok($max_attempts > 0, "invalid directory creation is detected");
+ok(
+ $node_standby->wait_for_log(
+ qr!WARNING: ( [A-Z0-9]+:)? creating missing directory: pg_tblspc/!,
+ $logstart),
+ "invalid directory creation is detected");

done_testing();
diff --git a/src/test/recovery/t/041_checkpoint_at_promote.pl b/src/test/recovery/t/041_checkpoint_at_promote.pl
index d0783fef9ae..932647e35d5 100644
--- a/src/test/recovery/t/041_checkpoint_at_promote.pl
+++ b/src/test/recovery/t/041_checkpoint_at_promote.pl
@@ -107,17 +107,8 @@ $node_standby->safe_psql('postgres',

# Wait until the previous restart point completes on the newly-promoted
# standby, checking the logs for that.
-my $checkpoint_complete = 0;
-foreach my $i (0 .. 10 * $PostgreSQL::Test::Utils::timeout_default)
-{
- if ($node_standby->log_contains("restartpoint complete", $logstart))
- {
- $checkpoint_complete = 1;
- last;
- }
- usleep(100_000);
-}
-is($checkpoint_complete, 1, 'restart point has completed');
+ok($node_standby->wait_for_log("restartpoint complete", $logstart),
+ 'restart point has completed');

# Kill with SIGKILL, forcing all the backends to restart.
my $psql_timeout = IPC::Run::timer(3600);
--
2.43.0

In response to

[PATCH] Fix loose polling in 019_replslot_limit.pl test at 2026-06-06 20:32:22 from Zakariyah Ali

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Daniil Davydov	2026-06-20 18:52:06	Re: BUG with accessing to temporary tables of other sessions still exists
Previous Message	Bryan Green	2026-06-20 17:48:05	Re: [PATCH] O_CLOEXEC not honored on Windows - handle inheritance chain