Re: buildfarm windows checks / tap tests on windows

From: Andres Freund <andres(at)anarazel(dot)de>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: buildfarm windows checks / tap tests on windows
Date: 2021-03-03 05:56:06
Message-ID: 20210303055606.lst7ri5roqixtxi7@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2021-03-02 21:20:11 -0800, Andres Freund wrote:
> On 2021-03-02 12:57:57 -0800, Andres Freund wrote:
> > t/003_recovery_targets.pl ............ 7/9
> > # Failed test 'multiple conflicting settings'
> > # at t/003_recovery_targets.pl line 151.
> >
> > # Failed test 'recovery end before target reached is a fatal error'
> > # at t/003_recovery_targets.pl line 177.
> > t/003_recovery_targets.pl ............ 9/9 # Looks like you failed 2 tests of 9.
> > t/003_recovery_targets.pl ............ Dubious, test returned 2 (wstat 512, 0x200)
> > Failed 2/9 subtests
>
> This appears to be caused by stderr in Windows Docker containers somehow
> not working quite right. cirrus-ci uses Docker on Windows.
>
> If you look e.g. at https://cirrus-ci.com/task/6111560255930368, and
> specifically at the relevant log file:
> https://api.cirrus-ci.com/v1/artifact/task/6111560255930368/log/src/test/recovery/tmp_check/log/003_recovery_targets_primary.log
> you can see that it's, uh, less full than we normally expect:
> 1 file(s) copied.
> 1 file(s) copied.
> 1 file(s) copied.
> 1 file(s) copied.
>
> As that test uses the log file to determine the state of servers:
> > my $logfile = slurp_file($node_standby->logfile());
> > ok($logfile =~ qr/multiple recovery targets specified/,
> > 'multiple conflicting settings');
>
> that doesn't work.
>
>
> I was *very* confused by this for a while. But the cluebat finally hit
> when I discovered that stderr works just fine for *other* programs,
> including the programs that evidently log into
> 003_recovery_targets_primary.log. The problem is that
> pgwin32_is_service() somehow decides that postgres is running as a
> service, despite that not really being the case (I guess Docker
> containers are internally started below a service, and that causes the
> problem).
>
> I hate everything right now. So much.
>
> I think it's quite nasty that postgres just silently starts to log to
> the event log. Why on earth wasn't the solution instead to hardcode that
> as a server parameter in pg_ctl register?
>
> Not sure what a good fix is for this.

FWIW, just forcing pgwin32_is_service() to return false seems to get the
cirrus tests past 003_recovery_targets.pl. It's possible the run still
won't finish due to other problems (or the overly tight timeouts I set),
but I think at least this one can be considered diagnosed.

https://cirrus-ci.com/task/5049764917018624?command=windows_worker_buf#L132
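
To make the failure mode concrete, here is a minimal, portable sketch of
the decision involved. It is not the actual PostgreSQL code: the stub
detected_as_service(), the wrapper is_service_with_override(), the helper
log_destination(), and the PG_TEST_NOT_A_SERVICE variable mentioned in the
usage note are all hypothetical names invented for illustration. The only
thing it models is that a "yes, we are a service" answer redirects
stderr-bound messages to the event log, and that forcing the answer to
false keeps them on stderr where the TAP test looks for them.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for the real detection logic. In the Docker
 * containers described above, the check reports "service" even though
 * postgres was started from a plain command line; this stub mimics that.
 */
static bool
detected_as_service(void)
{
	return true;
}

/*
 * Hypothetical wrapper: honor an explicit opt-out before trusting the
 * detection result. opt_out is the value of some override setting
 * (NULL if unset); only the exact string "1" forces "not a service".
 */
static bool
is_service_with_override(const char *opt_out, bool detected)
{
	if (opt_out != NULL && strcmp(opt_out, "1") == 0)
		return false;
	return detected;
}

/* Where stderr-bound messages end up for a given answer. */
static const char *
log_destination(bool is_service)
{
	return is_service ? "eventlog" : "stderr";
}
```

A caller could pass getenv("PG_TEST_NOT_A_SERVICE") as opt_out (again, an
invented name, not an existing setting). Whether an opt-out like this or
an explicit option set by pg_ctl register is the better design is exactly
the open question above.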

Greetings,

Andres Freund
