Re: Timeout control within tests

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Timeout control within tests
Date: 2022-02-18 07:19:11
Message-ID: 20220218071911.GB3506226@rfd.leadboat.com
Lists: pgsql-hackers

On Thu, Feb 17, 2022 at 09:48:25PM -0800, Andres Freund wrote:
> On 2022-02-17 21:28:42 -0800, Noah Misch wrote:
> > I propose to have environment variable PG_TEST_TIMEOUT_DEFAULT control the
> > timeout used in the places that currently hard-code 180s.
>
> Meson's test runner has the concept of a "timeout multiplier" for ways of
> running tests. Meson's stuff is about entire tests (i.e. one tap test), so
> doesn't apply here, but I wonder if we shouldn't do something similar?

Hmmm. It is good if the user can express an intent that continues to make
sense if we change the default timeout. For the buildfarm use case, a
multiplier is moderately better on that axis (PG_TEST_TIMEOUT_MULTIPLIER=100
beats PG_TEST_TIMEOUT_DEFAULT=18000). For the hacker use case, an absolute
value is substantially better on that axis (PG_TEST_TIMEOUT_DEFAULT=3 beats
PG_TEST_TIMEOUT_MULTIPLIER=.016666).
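
To make the arithmetic above concrete, here is a minimal Python sketch of how a test harness might resolve its timeout from either style of setting. It is an illustration only, not the actual test code: PG_TEST_TIMEOUT_MULTIPLIER is a hypothetical variable, the function name is invented, and letting the absolute value take precedence is an assumption of the sketch. The 180s default is the hard-coded value mentioned earlier in the thread.

```python
import os

# Seconds; the value the thread says is currently hard-coded.
BUILTIN_DEFAULT = 180.0


def effective_timeout(env=None):
    """Resolve the test timeout in seconds from an environment mapping.

    PG_TEST_TIMEOUT_DEFAULT is the variable proposed in this thread.
    PG_TEST_TIMEOUT_MULTIPLIER is hypothetical, and giving the absolute
    value precedence over it is an assumption of this sketch.
    """
    env = os.environ if env is None else env

    absolute = env.get("PG_TEST_TIMEOUT_DEFAULT")
    if absolute:
        # Absolute setting: unaffected by any change to the built-in default.
        return float(absolute)

    multiplier = env.get("PG_TEST_TIMEOUT_MULTIPLIER")
    if multiplier:
        # Relative setting: scales with whatever the built-in default is.
        return BUILTIN_DEFAULT * float(multiplier)

    return BUILTIN_DEFAULT


if __name__ == "__main__":
    print(effective_timeout({"PG_TEST_TIMEOUT_MULTIPLIER": "100"}))
    print(effective_timeout({"PG_TEST_TIMEOUT_DEFAULT": "3"}))
```

With the default at 180s, a multiplier of 100 and an absolute value of 18000 give the same result, as do a multiplier of .016666 and an absolute value of 3 (to within rounding). The difference only appears if the built-in default changes.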

> That
> way we could adjust different timeouts with one setting, instead of many
> different fobs to adjust?

I expect multiplier vs. absolute value doesn't change the expected number of
settings. If this change proceeds, we'd have three: PG_TEST_TIMEOUT_DEFAULT,
PGCTLTIMEOUT, and PGISOLATIONTIMEOUT. PGCTLTIMEOUT is separate for conceptual
reasons, and PGISOLATIONTIMEOUT is separate for historical reasons. There's
little use case for setting them to unequal values. If Meson can pass down
the overall timeout in effect for the test file, we could compute all three
variables from the passed-down value. Orthogonal to Meson, as I mentioned, we
could eliminate PGISOLATIONTIMEOUT.
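
A rough Python sketch of computing all three variables from one passed-down value follows. The helper name and the choice to set all three to the same whole number of seconds are assumptions made for illustration; how Meson would actually pass the value down is not settled in this thread.

```python
def derive_timeout_env(overall_seconds):
    """Build the three timeout settings from a single value in seconds.

    Hypothetical helper: all three variables get the same value, since
    there is little use case for setting them to unequal values.
    """
    if overall_seconds <= 0:
        raise ValueError("timeout must be positive")
    # Truncate to whole seconds so every consumer sees a plain integer.
    value = str(int(overall_seconds))
    return {
        "PG_TEST_TIMEOUT_DEFAULT": value,
        "PGCTLTIMEOUT": value,
        "PGISOLATIONTIMEOUT": value,
    }


if __name__ == "__main__":
    for name, value in derive_timeout_env(1800).items():
        print(f"{name}={value}")
```

A test driver could merge the returned mapping into the child process environment before launching a test file.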

timeouts.spec used to have substantial timeouts that had to elapse for the
test to pass. (Commit 741d7f1 ended that era.) A multiplier would have been
a good fit for that use case. If a similar test came back, we'd likely want
two multipliers, a low one for elapsing timeouts and a high one for
non-elapsing timeouts. A multiplier of 10-100 is reasonable for non-elapsing
timeouts, with the exact value being irrelevant on the buildfarm. Setting an
elapsing timeout higher than necessary causes measurable waste.

One could argue for offering both a multiplier variable and an absolute-value
variable. If there's just one variable, I think the absolute-value variable
is more compelling, due to the aforementioned hacker use case. What do you
think?
