random failing builds on spoonbill - backends not exiting...

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: random failing builds on spoonbill - backends not exiting...
Date: 2012-06-22 18:16:20
Message-ID: 4FE4B674.3020500@kaltenbrunner.cc
Lists: pgsql-hackers

It has now happened at least twice that builds on spoonbill started to
fail after it failed during ECPGcheck:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2012-06-19%2023%3A00%3A04

the first failure was:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=spoonbill&dt=2012-05-24%2023%3A00%3A05

so in both cases the postmaster was not shutting down properly and it was
in fact still running - I have attached gdb to the still running backend:

(gdb) bt
#0 0x0000000208eb5928 in poll () from /usr/lib/libc.so.62.0
#1 0x000000020a972b88 in _thread_kern_poll (wait_reqd=Variable
"wait_reqd" is not available.
) at /usr/src/lib/libpthread/uthread/uthread_kern.c:784
#2 0x000000020a973d04 in _thread_kern_sched (scp=0x0) at
/usr/src/lib/libpthread/uthread/uthread_kern.c:384
#3 0x000000020a96c080 in select (numfds=Variable "numfds" is not available.
) at /usr/src/lib/libpthread/uthread/uthread_select.c:170
#4 0x00000000003a2894 in ServerLoop () at postmaster.c:1321
#5 0x00000000003a45ac in PostmasterMain (argc=Variable "argc" is not
available.
) at postmaster.c:1121
#6 0x0000000000326df8 in main (argc=6, argv=0xffffffffffff14f8) at
main.c:199
(gdb) print Shutdown
$2 = 2
(gdb) print pmState
$3 = PM_WAIT_BACKENDS
(gdb) p *(Backend *) (BackendList->dll_head)
Cannot access memory at address 0x0
(gdb) p *BackendList
$9 = {dll_head = 0x0, dll_tail = 0x0}
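
As a reading aid for the values printed above, here is a minimal sketch
in Python. It assumes the shutdown-mode constants in postmaster.c are
NoShutdown = 0, SmartShutdown = 1 and FastShutdown = 2; those values are
not shown in the gdb session and are an assumption, under which
Shutdown = 2 would mean a fast shutdown was requested. The helper name
is hypothetical.

```python
# Assumed values of the shutdown-mode constants in postmaster.c.
# They are NOT shown in the gdb session above.
SHUTDOWN_MODES = {0: "NoShutdown", 1: "SmartShutdown", 2: "FastShutdown"}


def describe_postmaster_state(shutdown, pm_state, backend_list_empty):
    """Summarize the three values inspected in gdb as one line of text."""
    mode = SHUTDOWN_MODES.get(shutdown, "unknown(%d)" % shutdown)
    summary = "%s, %s" % (mode, pm_state)
    if backend_list_empty:
        # dll_head == 0x0 and dll_tail == 0x0 in the session above
        summary += "; BackendList is empty"
    return summary


# Values taken from the gdb session: Shutdown = 2, pmState = PM_WAIT_BACKENDS
print(describe_postmaster_state(2, "PM_WAIT_BACKENDS", True))
```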

all processes are still running:

pgbuild 18020 0.0 1.2 5952 12408 ?? I Wed04AM 0:03.98
/home/pgbuild/pgbuildfarm/HEAD/pgsql.5709/src/interfaces/ecpg/test/./tmp_check/install//home/pgbuild/pgbuildfarm/HEAD/inst/bin/postgres
-D /
pgbuild 21483 0.0 0.7 6088 7296 ?? Is Wed04AM 0:00.68
postgres: checkpointer process (postgres)
pgbuild 12480 0.0 0.4 5952 4464 ?? Ss Wed04AM 0:06.88
postgres: writer process (postgres)
pgbuild 9841 0.0 0.5 5952 4936 ?? Ss Wed04AM 0:06.92
postgres: wal writer process (postgres)
pgbuild 623 0.1 0.6 7424 6288 ?? Ss Wed04AM 4:16.76
postgres: autovacuum launcher process (postgres)
pgbuild 30949 0.0 0.4 6280 3896 ?? Ss Wed04AM 0:40.94
postgres: stats collector process (postgres)

sending a manual kill -15 to any of them does not seem to make them
exit either...
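
The manual check can be sketched roughly as follows: send SIGTERM
(kill -15) to a PID and poll whether the process still exists. This is
only an illustration and not part of the buildfarm tooling; the function
names and the timeout are hypothetical. Note that a process which is an
unreaped child of the caller would still count as existing.

```python
import os
import signal
import time


def is_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the checks without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def survives_sigterm(pid, wait_seconds=5.0):
    """Send SIGTERM to pid and report whether it is still there afterwards."""
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + wait_seconds
    while time.monotonic() < deadline:
        if not is_alive(pid):
            return False
        time.sleep(0.1)
    return True
```

For the processes listed above one would call survives_sigterm(21483),
survives_sigterm(12480) and so on, as the pgbuild user.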

I did some further investigations with Robert on IM but I don't think
he has any further ideas other than that I have a weird OS :)
It seems worth noting that this is OpenBSD 5.1 on Sparc64 which has a
new threading implementation compared to older OpenBSD versions.

Stefan
