Re: Increasing timeout of poll_query_until for TAP tests

From: Noah Misch <noah(at)leadboat(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increasing timeout of poll_query_until for TAP tests
Date: 2018-01-01 18:18:01
Message-ID: 20180101181801.GA2925790@rfd.leadboat.com
Lists: pgsql-hackers

On Mon, Jan 01, 2018 at 07:55:37PM +0900, Michael Paquier wrote:
> On Sun, Dec 31, 2017 at 09:52:27PM -0800, Noah Misch wrote:
> > Since now() is transaction_timestamp(), $recovery_time precedes or equals
> > $lsn3, and this didn't close the race. Using clock_timestamp() here would
> > work, as does using separate transactions like recovery-test-fixes.patch did.
> > I'll shortly push a fix for this and a similar ordering problem in the
> > standby_5 test, which first appeared subsequent to this thread.
>
> As recovery_target_inclusive is true by default, my conclusion on the
> matter, which was something that my tests on hamster, the now-dead
> buildfarm animal, seemed to confirm, is that just getting a timestamp at
> least the value of the LSN from the same transaction was enough to fix
> all the failures. And hamster was really slow. I can follow why
> logically your patch makes sense, so I agree that this is sane. Have you
> spotted failures from the buildfarm?

No, but I checked only the last 90 days. Earlier master (e.g. git checkout
6078770^) with the following patch reproduces the failures on every run:

--- a/src/test/recovery/t/003_recovery_targets.pl
+++ b/src/test/recovery/t/003_recovery_targets.pl
@@ -71,8 +71,8 @@ my ($lsn2, $recovery_txid) = split /\|/, $ret;
 $node_master->safe_psql('postgres',
 "INSERT INTO tab_int VALUES (generate_series(2001,3000))");
 $ret =
- $node_master->safe_psql('postgres', "SELECT pg_current_wal_lsn(), now();");
-my ($lsn3, $recovery_time) = split /\|/, $ret;
+ $node_master->safe_psql('postgres', "SELECT pg_sleep(80), pg_current_wal_lsn(), now();");
+my ($delay_for_autovacuum, $lsn3, $recovery_time) = split /\|/, $ret;

 # Even more data, this time with a recovery target name
 $node_master->safe_psql('postgres',
@@ -88,6 +88,7 @@ $node_master->safe_psql('postgres',
 "INSERT INTO tab_int VALUES (generate_series(4001,5000))");
 my $recovery_lsn =
 $node_master->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+$node_master->safe_psql('postgres', 'VACUUM'); # write some WAL
 my $lsn5 =
 $node_master->safe_psql('postgres', "SELECT pg_current_wal_lsn();");

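For readers who want the ordering problem spelled out, here is a toy model in
Python. It is not PostgreSQL code. The Commit class, the integer clock and the
function names are all hypothetical. It assumes that recovery to a target time
(with recovery_target_inclusive on) replays every commit whose timestamp is at
or before the target, and that the test then waits for standby replay to reach
$lsn3. Any unrelated commit (autovacuum, say) landing between the transaction
start and the LSN read leaves now() too early.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    time: int     # commit timestamp, as a tick of a logical clock
    end_lsn: int  # WAL position just after the commit record


def replay_until_time(wal, target_time):
    """Replay commits with timestamp <= target_time; return the final replay LSN."""
    replayed = 0
    for commit in wal:
        if commit.time > target_time:
            break  # first commit past the target time: recovery stops here
        replayed = commit.end_lsn
    return replayed


def standby_reaches_lsn3(background_commits, use_clock_timestamp):
    """Model one statement: SELECT pg_sleep(..), pg_current_wal_lsn(), <timestamp>.

    background_commits is the number of unrelated commits (e.g. autovacuum)
    that happen while the statement sleeps.
    """
    clock = 1
    wal = [Commit(time=clock, end_lsn=100)]  # the INSERT preceding the capture

    clock += 1
    xact_start = clock  # what now() reports for the whole transaction

    for _ in range(background_commits):
        clock += 1
        wal.append(Commit(time=clock, end_lsn=wal[-1].end_lsn + 100))

    clock += 1
    lsn3 = wal[-1].end_lsn  # LSN read after the sleep
    recovery_time = clock if use_clock_timestamp else xact_start

    return replay_until_time(wal, recovery_time) >= lsn3


if __name__ == "__main__":
    print(standby_reaches_lsn3(0, use_clock_timestamp=False))
    print(standby_reaches_lsn3(1, use_clock_timestamp=False))
    print(standby_reaches_lsn3(1, use_clock_timestamp=True))
```

In this model the now() variant only works when nothing else commits during
the statement, which is why adding pg_sleep(80) makes the failure reproducible.
Reading the timestamp after the LSN (clock_timestamp(), or a separate later
transaction) removes the dependency on timing.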