Re: Timing-sensitive case in src/test/recovery TAP tests

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Timing-sensitive case in src/test/recovery TAP tests
Date: 2017-06-26 01:48:10
Message-ID: CAMsr+YELP+j4OypCoYtRL_Z-KoYrDc+RdqX56mvTBTPqL_55VA@mail.gmail.com
Lists: pgsql-hackers

On 26 June 2017 at 05:10, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I've been experimenting with a change to pg_ctl, which I'll post
> separately, to reduce its reaction time so that it reports success
> more quickly after a wait for postmaster start/stop. I found one
> case in "make check-world" that got a failure when I reduced the
> reaction time to ~1ms. That's the very last test in 001_stream_rep.pl,
> "cascaded slot xmin reset after startup with hs feedback reset", and
> the cause appears to be that it's not allowing any delay time for a
> replication slot's state to update after a postmaster restart.
>
> This seems worth fixing independently of any possible code changes,
> because it shows that this test could fail on a slow or overloaded
> machine. I couldn't find any instances of such a failure in the
> buildfarm archives, but that may have a lot to do with the fact that
> owners of slow buildfarm animals are (mostly?) not running this test.
>
> Some experimentation says that the minimum delay needed to make it
> work reliably on my workstation is about 100ms, so a simple patch
> along the lines of the attached might be good enough. I find this
> approach conceptually dissatisfying, though, since it's still
> potentially vulnerable to the failure under sufficient load.
> I wonder if there is an easy way to improve that ... maybe convert
> to something involving poll_query_until?

This should do the trick:

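# Retry until the slot's xmin is NULL; poll_query_until returns 0 on timeout.
$node_standby_1->poll_query_until('postgres',
	q[SELECT xmin IS NULL FROM pg_replication_slots WHERE slot_name = '] . $slotname_2 . q['])
  or die "timed out waiting for slot xmin to reset";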
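
The point of polling is to retry the check until it passes or a limit is reached, rather than sleeping for a fixed interval that may still be too short under load. Here is a minimal standalone sketch of that pattern in plain Perl. The helper poll_until, its parameters, and the fake check are hypothetical, for illustration only; this is not the PostgresNode implementation.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

# Hypothetical helper, not part of PostgresNode: call $check repeatedly
# until it returns true or $max_attempts checks have been made.
# Returns 1 on success, 0 if the attempts ran out.
sub poll_until
{
	my ($check, $max_attempts, $delay_us) = @_;
	for my $attempt (1 .. $max_attempts)
	{
		return 1 if $check->();
		usleep($delay_us);    # brief pause before the next attempt
	}
	return 0;
}

# Stand-in for "the slot's xmin has been reset": true from the third call on.
my $calls = 0;
my $ok = poll_until(sub { return ++$calls >= 3 }, 10, 1000);
print "condition met after $calls checks\n" if $ok;
```

A fast machine exits the loop on the first successful check, while a slow one simply takes more attempts, so the test's outcome no longer depends on guessing a delay.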

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
