Isn't wait_for_catchup slightly broken?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Isn't wait_for_catchup slightly broken?
Date: 2022-01-10 19:31:38
Message-ID: 2368336.1641843098@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While trying to make sense of some recent buildfarm failures,
I happened to notice that the default query issued by
the TAP sub wait_for_catchup looks like

SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = '<whatever>';

ISTM there are two things wrong with this:

1. Since pg_current_wal_lsn() is re-evaluated each time, we're
effectively setting a moving target for the standby to reach.
Admittedly we're not going to be issuing any new DML while
waiting in wait_for_catchup, but background activity such as
autovacuum could be creating new WAL. Thus, the test is likely
to wait longer than it needs to. In the worst case, we'd never
catch up until the primary server has been totally quiescent
for awhile.

2. Aside from being slower than necessary, this also makes the
test squishy and formally incorrect, because the standby might
get the opportunity to replay more WAL than the test intends.

So I think we need to fix it to capture the target WAL position
at the start, as I've done in the attached patch. In principle
this might make things a bit slower because of the extra
transaction required, but I don't notice any above-the-noise
difference on my own workstation.

Another thing that is bothering me a bit is that a number of the
callers use $node->lsn('insert') as the target. This also seems
rather dubious, because that could be ahead of what's been written
out. These callers are just taking it on faith that something will
eventually cause that extra WAL to get written out (and become
available to the standby). Again, that seems to make the test
slower than it need be, with a worst-case scenario being that it
eventually times out. Admittedly this is unlikely to be a big
problem unless some background op issues an abortive transaction
at just the wrong time. Nonetheless, I wonder if we shouldn't
standardize on "thou shalt use the write position", because I
don't think the other alternatives have anything to recommend them.
I've not addressed that below, though I did tweak the comment about
that parameter.

Thoughts?

regards, tom lane

Attachment Content-Type Size
fix-wait-for-catchup-default-behavior.patch text/x-diff 1.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Allie Crawford 2022-01-10 19:53:42 Stream Replication not working
Previous Message Bossart, Nathan 2022-01-10 19:01:32 Re: make MaxBackends available in _PG_init