Re: Tracing down buildfarm "postmaster does not shut down" failures

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Tracing down buildfarm "postmaster does not shut down" failures
Date: 2016-02-10 02:33:00
Message-ID: 56BAA15C.2010603@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02/09/2016 08:49 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 02/09/2016 07:49 PM, Tom Lane wrote:
>>> However, I'd already noted from some other digging in the buildfarm
>>> logs that axolotl's speed seems to vary tremendously. I do not
>>> know what else you typically run on that hardware, but putting it
>>> under full load might help prove things.
>> Almost nothing. It's a Raspberry Pi 2B that I got mainly to run a
>> buildfarm animal. About the only other thing of note is a very lightly
>> configured Nagios instance.
> Huh, that's quite strange. There is one fairly recent report of
> axolotl failing "make check" because of taking over a minute to shut down:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-12-14%2020%3A30%3A52
>
> but the runs before and after that show shutdown times of only a second or
> two.

No idea why.

anyway, we got a failure pretty quickly:

pg_ctl: server does not shut down at 2016-02-09 21:10:11.914 EST

pg_regress: could not stop postmaster: exit code was 256
Makefile:81: recipe for target 'check' failed
make: *** [check] Error 2

The log traces from the shutdown are below.

cheers

andrew

LOG: received fast shutdown request at 2016-02-09 21:09:11.824 EST
LOG: aborting any active transactions
LOG: autovacuum launcher shutting down at 2016-02-09 21:09:11.830 EST
LOG: shutting down at 2016-02-09 21:09:11.839 EST
LOG: checkpoint starting: shutdown immediate
LOG: CheckPointGuts starting at 2016-02-09 21:09:11.840 EST
LOG: CheckPointCLOG() done at 2016-02-09 21:09:11.840 EST
LOG: CheckPointCommitTs() done at 2016-02-09 21:09:11.840 EST
LOG: CheckPointSUBTRANS() done at 2016-02-09 21:09:11.841 EST
LOG: CheckPointMultiXact() done at 2016-02-09 21:09:11.841 EST
LOG: CheckPointPredicate() done at 2016-02-09 21:09:11.841 EST
LOG: CheckPointRelationMap() done at 2016-02-09 21:09:11.841 EST
LOG: CheckPointReplicationSlots() done at 2016-02-09 21:09:11.841 EST
LOG: CheckPointSnapBuild() done at 2016-02-09 21:09:11.841 EST
LOG: CheckPointLogicalRewriteHeap() done at 2016-02-09 21:09:11.842 EST
LOG: BufferSync(5) beginning to write 172 buffers at 2016-02-09
21:09:11.845 EST
LOG: BufferSync(5) done, wrote 172/172 buffers at 2016-02-09
21:09:14.635 EST
LOG: CheckPointBuffers() done at 2016-02-09 21:09:14.636 EST
LOG: CheckPointReplicationOrigin() done at 2016-02-09 21:09:14.636 EST
LOG: CheckPointGuts done at 2016-02-09 21:09:14.636 EST
LOG: checkpoint WAL record flushed at 2016-02-09 21:09:14.636 EST
LOG: pg_control updated at 2016-02-09 21:09:14.637 EST
LOG: smgrpostckpt() done at 2016-02-09 21:09:14.668 EST
LOG: RemoveOldXlogFiles() done at 2016-02-09 21:09:14.668 EST
LOG: TruncateSUBTRANS() done at 2016-02-09 21:09:14.669 EST
LOG: checkpoint complete: wrote 172 buffers (1.0%); 0 transaction log
file(s) added, 0 removed, 0 recycled; write=2.794 s, sync=0.000 s,
total=2.829 s; sync files=0, longest=0.000 s, average=0.000 s;
distance=966 kB, estimate=966 kB
LOG: shutdown checkpoint complete at 2016-02-09 21:09:14.669 EST
LOG: ShutdownCLOG() complete at 2016-02-09 21:09:14.669 EST
LOG: ShutdownCommitTs() complete at 2016-02-09 21:09:14.669 EST
LOG: ShutdownSUBTRANS() complete at 2016-02-09 21:09:14.669 EST
LOG: database system is shut down at 2016-02-09 21:09:14.669 EST
LOG: doing before_shmem_exit 0 at 2016-02-09 21:09:14.670 EST
LOG: doing on_shmem_exit 2 at 2016-02-09 21:09:14.670 EST
LOG: doing on_shmem_exit 1 at 2016-02-09 21:09:14.670 EST
LOG: doing on_shmem_exit 0 at 2016-02-09 21:09:14.670 EST
LOG: doing on_proc_exit 1 at 2016-02-09 21:09:14.670 EST
LOG: doing on_proc_exit 0 at 2016-02-09 21:09:14.671 EST
LOG: calling exit(0) at 2016-02-09 21:09:14.671 EST
LOG: checkpointer dead at 2016-02-09 21:09:14.683 EST
LOG: all children dead at 2016-02-09 21:10:11.184 EST
LOG: doing on_shmem_exit 3 at 2016-02-09 21:10:11.191 EST
LOG: doing on_shmem_exit 2 at 2016-02-09 21:10:11.191 EST
LOG: doing on_shmem_exit 1 at 2016-02-09 21:10:11.192 EST
LOG: doing on_shmem_exit 0 at 2016-02-09 21:10:11.208 EST
LOG: doing on_proc_exit 1 at 2016-02-09 21:10:11.209 EST
LOG: doing on_proc_exit 0 at 2016-02-09 21:10:11.209 EST
LOG: lock files all released at 2016-02-09 21:10:11.211 EST
LOG: calling exit(0) at 2016-02-09 21:10:11.211 EST

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2016-02-10 02:41:27 Re: Tracing down buildfarm "postmaster does not shut down" failures
Previous Message Masahiko Sawada 2016-02-10 02:25:49 Re: Support for N synchronous standby servers - take 2