Re: Continuous buildfarm failures on hamster with bin-check

From: Noah Misch <noah(at)leadboat(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Continuous buildfarm failures on hamster with bin-check
Date: 2017-06-04 21:12:29
Message-ID: 20170604211229.GA1528911@rfd.leadboat.com
Lists: pgsql-hackers

On Tue, Apr 18, 2017 at 09:59:26PM +0900, Michael Paquier wrote:
> On Tue, Apr 18, 2017 at 9:35 PM, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> wrote:
> > On 04/18/2017 08:23 AM, Michael Paquier wrote:
> >> Increasing wal_sender_timeout and wal_receiver_timeout can help in
> >> reducing the failures seen.
> >
> > OK, but you're only talking about a handful of these, right?
>
> Yup, that would be one solution but that's not attacking the problem
> at its root.
>
> > Let's say we have a bunch of possible environment settings with names
> > that all begin with "PG_TAP_". PostgresNode.pm could check for the
> > existence of these and take action accordingly, and you could set them
> > on a buildfarm animal in the config file, or for interactive use in your
> > .profile.
>
> That's the point I am trying to make upthread: slow buildfarm animals
> should have minimal impact on core code modifications. We could for
> example have one environment variable that lists all the parameters to
> modify in a single string and appends them at the end of
> postgresql.conf. But honestly I don't think that this is necessary if
> there is only one variable able to define a base directory for
> temporary statistics as the real bottleneck comes from there at least
> in the case of hamster. When initializing a node via PostgresNode.pm,
> we would just check for this variable, and the init() routine just
> creates a temporary folder in it, setting up stats_temp_directory in
> postgresql.conf.

Each of the above approaches has fairly low impact on the code, so we should
use other criteria to choose. I'd welcome a feature for augmenting every
postgresql.conf of every test suite (a generalization of "pg_regress
--temp-config", which has proven its value). I can envision using it with
force_parallel_mode, default_transaction_isolation, log_*, wal_*_timeout,
autovacuum_naptime, and others.
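
To make the idea concrete, here is a minimal sketch of such a hook, assuming a
single environment variable that names a file of extra settings. It is written
in Python only for illustration; PostgresNode.pm is Perl, and the variable name
PG_TAP_TEMP_CONFIG and the function augment_conf are hypothetical, not
anything this thread has settled on.

```python
import os
from pathlib import Path

# Hypothetical variable name; nothing in this thread settles on one.
ENV_VAR = "PG_TAP_TEMP_CONFIG"


def augment_conf(conf_path, environ=None):
    """Append the file named by ENV_VAR to conf_path.

    Returns True if settings were appended, False if the variable is unset.
    """
    environ = os.environ if environ is None else environ
    extra_path = environ.get(ENV_VAR)
    if not extra_path:
        return False

    extra = Path(extra_path).read_text()
    with open(conf_path, "a") as conf:
        # Appending is enough: when a parameter appears more than once in
        # postgresql.conf, the last occurrence wins.
        conf.write("\n# settings appended from %s\n" % ENV_VAR)
        conf.write(extra)
        if not extra.endswith("\n"):
            conf.write("\n")
    return True
```

A test harness would call something like this once per node, right after
writing its own postgresql.conf, so that a slow animal could adjust
wal_*_timeout or stats_temp_directory without any change to the tests.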

Even for hamster, I'm skeptical that changing stats_temp_directory would
suffice. Every hamster BinInstallCheck failure since 2017-02-13 had a "LOG:
terminating walsender process due to replication timeout". Most, but not all,
of those replication timeouts followed a "LOG: using stale statistics instead
of current ones because stats collector is not responding". For the remaining
minority, I expect to eventually need a wal_sender_timeout change. Example:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2017-02-24%2016%3A00%3A06
