Re: [HACKERS] Re: pgsql: Make new crash restart test a bit more robust.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Re: pgsql: Make new crash restart test a bit more robust.
Date: 2017-09-22 02:37:38
Message-ID: CAEepm=0TE90nded+bNthP45_PEvGAAr=3gxhHJObL4xmOLtX0w@mail.gmail.com

On Wed, Sep 20, 2017 at 4:42 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2017-09-19 19:00:38 -0700, Andres Freund wrote:
>> Given this fact pattern, I'll allow the case without a received error
>> message in the recovery test. Objections?
>
> Hearing none. Pushed.
>
> While debugging this, I've also introduced a pump wrapper so that we now
> get:
> ok 4 - exactly one process killed with SIGQUIT
> # aborting wait: program died
> # stream contents: >>psql:<stdin>:9: WARNING: terminating connection because of crash of another server process
> # DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
> # HINT: In a moment you should be able to reconnect to the database and repeat your command.
> # psql:<stdin>:9: server closed the connection unexpectedly
> # This probably means the server terminated abnormally
> # before or while processing the request.
> # psql:<stdin>:9: connection to server was lost
> # <<
> # pattern searched for: (?^m:MAYDAY: terminating connection because of crash of another server process)
> not ok 5 - psql query died successfully after SIGQUIT

I'm seeing these failures in 013_crash_restart.pl from time to time on
Travis CI.  Examples:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/278419122
https://travis-ci.org/postgresql-cfbot/postgresql/builds/278247756
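
For anyone who hasn't looked at the test, the behaviour of the pump
wrapper quoted above can be sketched roughly as follows.  This is a
Python stand-in, not the Perl code in 013_crash_restart.pl; the function
name pump_until_match, its parameters, and the "timed out" case are
assumptions made for illustration.  Only the three diagnostic lines
mirror the quoted output.

```python
import os
import re
import select
import sys
import time


def pump_until_match(proc, pattern, timeout=5.0):
    """Read proc.stdout until `pattern` matches, EOF, or timeout.

    Hypothetical helper: `proc` is a subprocess.Popen created with
    stdout=subprocess.PIPE.  POSIX only, since it select()s on a pipe.
    Returns (matched, collected_text).
    """
    regex = re.compile(pattern, re.MULTILINE)
    fd = proc.stdout.fileno()
    buf = b""
    deadline = time.monotonic() + timeout
    while True:
        text = buf.decode(errors="replace")
        if regex.search(text):
            return True, text
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            reason = "timed out"
            break
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            reason = "timed out"
            break
        chunk = os.read(fd, 4096)
        if not chunk:
            # EOF: the child closed its stdout, normally because it exited.
            reason = "program died"
            break
        buf += chunk
    # Report what was actually received, so a failure can be diagnosed
    # from the log alone.
    print(f"# aborting wait: {reason}", file=sys.stderr)
    print(f"# stream contents: >>{text}<<", file=sys.stderr)
    print(f"# pattern searched for: {pattern}", file=sys.stderr)
    return False, text
```

The point of the wrapper is the failure path: when the expected message
never arrives, the log shows both the stream contents and the pattern,
rather than just a bare "not ok".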

There are a couple of other weird problems in the TAP tests that
probably belong on another thread.  See build IDs 278302509 and
278247756, which are for different CF patches but exhibit the same
symptom: some test never returns control, and we can't see its output
(maybe due to -Otarget) before Travis kills the whole job for not
making progress.

--
Thomas Munro
http://www.enterprisedb.com
