Re: fairywren hung in pg_basebackup tests

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: fairywren hung in pg_basebackup tests
Date: 2022-07-26 04:53:47
Message-ID: 20220726045347.GA3528716@rfd.leadboat.com
Lists: pgsql-hackers

On Mon, Jul 25, 2022 at 11:35:12AM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Mon, Jul 25, 2022 at 09:44:21AM -0400, Andrew Dunstan wrote:
> >> Perhaps we should have a guard in system_or_bail() and/or system_log()
> >> which bails if some element of @_ is undefined.
>
> +1, seeing how hard this is to diagnose.
>
> > That would be reasonable. Also reasonable to impose some long timeout, maybe
> > 10x or 100x PG_TEST_TIMEOUT_DEFAULT, on calls to those functions.
>
> Why would it need to be more than PG_TEST_TIMEOUT_DEFAULT?

We run some long commands, like the parallel_schedule runs. Those currently
use plain system(), but they probably should have used system_log() from a
logging standpoint. If they had, PG_TEST_TIMEOUT_DEFAULT would have been too
short. One could argue that anything that slow should declare its intent to
be that slow, but that argument is getting into the territory of a policy
change rather than a backstop for clearly-unintended longevity.
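
To make the two ideas above concrete, here is a minimal sketch of a command wrapper with both backstops: it refuses undefined arguments up front, and it applies a long default timeout derived from PG_TEST_TIMEOUT_DEFAULT. This is an illustration only, written in Python rather than the Perl of the actual TAP helpers. The function body, the 180-second fallback, and the 10x factor (BACKSTOP_FACTOR) are assumptions, not the behavior of the real system_or_bail().

```python
import os
import subprocess

# Assumption: fall back to 180 seconds when PG_TEST_TIMEOUT_DEFAULT is unset.
DEFAULT_TIMEOUT = float(os.environ.get("PG_TEST_TIMEOUT_DEFAULT", "180"))
# Assumption: the thread floats 10x or 100x; 10x is picked here arbitrarily.
BACKSTOP_FACTOR = 10


def system_or_bail(*args, timeout=None):
    """Run a command, failing fast on undefined arguments, hangs, or errors.

    Hypothetical Python analogue of the Perl helper discussed in this thread.
    """
    if not args:
        raise ValueError("empty command")

    # Guard: an undefined (None) element would otherwise produce a command
    # line that is hard to diagnose after the fact.
    missing = [i for i, arg in enumerate(args) if arg is None]
    if missing:
        raise ValueError(f"undefined command argument(s) at index(es) {missing}")

    # Backstop: callers that know they are slow can pass their own timeout;
    # everyone else gets a generous multiple of the default.
    if timeout is None:
        timeout = DEFAULT_TIMEOUT * BACKSTOP_FACTOR

    try:
        result = subprocess.run([str(arg) for arg in args], timeout=timeout)
    except subprocess.TimeoutExpired:
        raise RuntimeError(
            f"command exceeded backstop timeout of {timeout}s: {args!r}"
        ) from None

    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: {args!r}"
        )
    return result.returncode
```

With a wrapper of this shape, a long run such as parallel_schedule would stay under the default backstop or pass an explicit timeout, while an accidental hang would still end in a clear failure.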
