From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Why is src/test/modules/committs/t/002_standby.pl flaky? |
Date: | 2022-03-19 08:47:04 |
Message-ID: | 20220319084704.GB2822749@rfd.leadboat.com |
Lists: | pgsql-hackers |

On Mon, Jan 10, 2022 at 04:25:27PM -0500, Tom Lane wrote:
> Apropos of that, it's worth noting that wait_for_catchup *is*
> dependent on up-to-date stats, and here's a recent run where
> it sure looks like the timeout cause is AWOL stats collector:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2022-01-10%2004%3A51%3A34
>
> I wonder if we should refactor wait_for_catchup to probe the
> standby directly instead of relying on the upstream's view.

It would be nice.  For logical replication tests, do we have a monitoring API
independent of the stats collector? If not and we don't want to add one, a
hacky alternative might be for wait_for_catchup to run a WAL-writing command
every ~20s. That way, if the stats collector misses the datagram about the
standby reaching a certain LSN, the stats collector would have more chances.
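
To make the hacky alternative concrete, here is a rough sketch of the loop shape, written in Python rather than the Perl of the TAP framework. Every name in it is hypothetical: `caught_up` stands for whatever probe wait_for_catchup already uses, `write_wal` stands for any cheap WAL-writing command, and the 20-second interval is just the "~20s" from above. It is an illustration of the idea under those assumptions, not a proposed patch.

```python
import time
from typing import Callable


def wait_for_catchup(
    caught_up: Callable[[], bool],      # hypothetical probe: has the standby reached the target LSN?
    write_wal: Callable[[], None],      # hypothetical nudge: run any cheap WAL-writing command
    timeout: float = 180.0,             # assumed overall timeout, in seconds
    nudge_interval: float = 20.0,       # the "~20s" interval between nudges
    poll_interval: float = 0.1,         # assumed delay between probes
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll caught_up() until it returns True or the timeout expires.

    While waiting, call write_wal() once every nudge_interval seconds so
    that a lost progress report gets further chances to be replaced by a
    fresh one.  Returns True on success, False on timeout.
    """
    start = clock()
    next_nudge = start + nudge_interval
    while True:
        if caught_up():
            return True
        now = clock()
        if now - start >= timeout:
            return False
        if now >= next_nudge:
            # Generate new WAL so the standby reports its position again.
            write_wal()
            next_nudge = now + nudge_interval
        sleep(poll_interval)
```

The clock and sleep functions are parameters only so the loop can be exercised without real waiting. The design point is that the nudge is independent of the probe: the probe stays exactly as it is today, and the extra WAL merely gives a dropped report another opportunity to arrive.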