Re: pgbench tap tests & minor fixes.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Nikolay Shaplov <dhyan(at)nataraj(dot)su>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench tap tests & minor fixes.
Date: 2017-09-19 02:54:49
Message-ID: 20170919025449.aprsaz6uegzra62b@alap3.anarazel.de
Lists: pgsql-hackers

On 2017-09-11 15:02:21 -0400, Andrew Dunstan wrote:
>
>
> On 09/11/2017 01:58 PM, Tom Lane wrote:
> > Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> writes:
> >> On 09/08/2017 09:40 AM, Tom Lane wrote:
> >>> Like you, I'm a bit worried about the code for extracting an exit
> >>> status from IPC::Run::run. We'll have to keep an eye on the buildfarm
> >>> for a bit. If there's any trouble, I'd be inclined to drop it down
> >>> to just success/fail rather than checking the exact exit code.
> >> bowerbird seems to have been made unhappy.
> > I saw that failure, but it appears to be a server-side crash:
> >
> > 2017-09-10 19:39:03.395 EDT [1100] LOG: server process (PID 11464) was terminated by exception 0xC0000005
> > 2017-09-10 19:39:03.395 EDT [1100] HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
> >
> > Given the lack of any log outputs from process 11464, it's hard to tell
> > what it was doing, but it seems not to be any of the backends running
> > pgbench queries. So maybe an autovac worker? I dunno. Anyway, it's
> > difficult to credit that this commit caused the failure, even if it did
> > happen during the new test case. I'm inclined to write it off as another
> > one of the random crashes that bowerbird seems prone to.
> >
> > If the failure proves repeatable, then of course we'll need to look
> > more closely.
> >
> >
>
>
> Hmm, it had several failures and now a success. Will keep an eye on it.

There was just another failure like that:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2017-09-19%2001%3A42%3A20
I first thought it might be the new recovery tests, or the changes
leading to their addition, but it's a different test and in the middle of
the run. Even so, I'd have looked towards my commit, except that
there are a number of previous reports that look similar.

Any chance you could get backtraces on these?

Greetings,

Andres Freund
