Re: Fixing WAL instability in various TAP tests

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fixing WAL instability in various TAP tests
Date: 2021-09-28 18:11:04
Message-ID: 2854602.1632852664@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> So there's more than one symptom, but in any case it seems like
> we have an issue in WAL replay. I wonder whether it's bloom's fault
> or a core bug.

Actually ... I bet it's just the test script's fault. It waits for the
standby to catch up like this:

    my $caughtup_query =
      "SELECT pg_current_wal_lsn() <= write_lsn FROM pg_stat_replication WHERE application_name = '$applname';";
    $node_primary->poll_query_until('postgres', $caughtup_query)
      or die "Timed out while waiting for standby 1 to catch up";

which seems like completely the wrong condition. Don't we need the
standby to have *replayed* the WAL, not merely written it to disk?

I'm also wondering why this doesn't use wait_for_catchup, instead
of reinventing the query to use.
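
For illustration, here is a rough sketch of the two alternatives the paragraphs above point at: comparing against replay_lsn instead of write_lsn, or calling wait_for_catchup. It is not a tested patch. It assumes $node_primary and $node_standby are the PostgresNode objects the test script has already set up (the name $node_standby is an assumption here), and that the standby's application_name matches its node name, which is what wait_for_catchup looks for in pg_stat_replication.

```perl
# Sketch only: assumes $node_primary and $node_standby are PostgresNode
# objects that the test script has already created and started, and that
# $applname holds the standby's application_name.

# Option 1: keep the hand-written query, but compare against replay_lsn so
# that we wait for the standby to apply the WAL, not merely write it out.
my $caughtup_query =
  "SELECT pg_current_wal_lsn() <= replay_lsn FROM pg_stat_replication WHERE application_name = '$applname';";
$node_primary->poll_query_until('postgres', $caughtup_query)
  or die "Timed out while waiting for standby 1 to catch up";

# Option 2: let the test framework build the query.  'replay' selects the
# replay_lsn column; the target is the primary's current insert position.
$node_primary->wait_for_catchup($node_standby, 'replay',
	$node_primary->lsn('insert'));
```

Only one of the two options would be needed in practice. The second avoids duplicating the catch-up query in each test script.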

regards, tom lane
