Re: pgsql: Add test for postmaster crash restarts.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Add test for postmaster crash restarts.
Date: 2017-09-19 16:47:18
Message-ID: 20170919164718.66hcedq2rtlkntvf@alap3.anarazel.de
Lists: pgsql-committers pgsql-hackers

Hi,

On 2017-09-19 12:13:54 -0400, Tom Lane wrote:
> IOW, the "$monitor" instance of psql did not complete making its
> connection until after the crash/restart cycle had occurred.

That'd be easy enough to fix...

Just something like

$monitor_stdin .= q[
SELECT $$am-i-up$$;
];
$monitor->pump until $monitor_stdout =~ /am-i-up/;
$monitor_stdout = '';

> So we're just sitting there waiting for a crash report that won't
> come. Which is another very serious deficiency in this test:
> lacking any sort of timeout, it will just freeze indefinitely
> if anything doesn't happen exactly the way it expects. From a
> buildfarm owner's standpoint, that's pretty damn unfriendly.
> It means having to manually unwedge your animals from time to time.

Note that I just copied the code for that from another test - this
isn't unique to this test. I agree that it'd be good to add a timeout to
those pump calls.
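
As a rough illustration of what a bounded wait could look like, here is a minimal sketch built on IPC::Run's timer objects. The helper name pump_until, the 2 second limit, and the use of `cat` as a stand-in for the psql session are all assumptions made for this example; none of it is taken from the test as committed.

```perl
use strict;
use warnings;
use IPC::Run ();

# Hypothetical helper (not from the original test): pump $proc until
# $$stream matches $until. Gives up once $timer has expired or the
# child can no longer be pumped. Returns 1 on a match, 0 otherwise.
sub pump_until
{
	my ($proc, $timer, $stream, $until) = @_;
	while (1)
	{
		return 1 if $$stream =~ /$until/;
		return 0 if $timer->is_expired;
		return 0 if !$proc->pumpable;
		# Blocks until there is I/O or a timer in the harness expires.
		$proc->pump;
	}
}

# Stand-in for the psql session: `cat` echoes back whatever it is sent.
my ($stdin, $stdout) = ('', '');
my $timer = IPC::Run::timer(2);    # seconds; value chosen arbitrarily
my $monitor =
  IPC::Run::start(['cat'], '<', \$stdin, '>', \$stdout, $timer);
$timer->start;                     # explicit (re)start, for clarity

# The handshake from above, now bounded by the timer.
$stdin .= "am-i-up\n";
pump_until($monitor, $timer, \$stdout, qr/am-i-up/)
  or die "monitor did not respond before the timeout";
$stdout = '';

# Output that never arrives makes the wait give up when the timer
# expires, instead of hanging forever.
my $ok = pump_until($monitor, $timer, \$stdout, qr/never-printed/);
print $ok ? "matched\n" : "gave up\n";

$monitor->kill_kill;
```

A timer (as opposed to IPC::Run::timeout) does not throw when it expires, so the caller can report a normal test failure and clean up, rather than catching an exception.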

> I'd like to ask you to revert this test, at least pending making
> it a whole lot more bulletproof.

Hm. Ok. That seems like an overreaction to me - the failure rate isn't
actually that high so far. I'm happy to add both timeouts and "earlier
startup" of the $monitor, but I'd prefer to do so in-tree - I'd run the
test through 100+ iterations locally, without any of this showing up.

> We don't really need crash recovery testing in the buildfarm IMO ---
> we hackers crash the system plenty often enough to notice problems
> there.

I for one don't exercise that kind of crash restart; my development
scripts all work with restart_after_crash = false. What I find more
concerning, however, is coverage of EXEC_BACKEND, which has far fewer
developers actively running it constantly.

Greetings,

Andres Freund
