
Re: Curious buildfarm failures

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Curious buildfarm failures
Date: 2013-01-14 21:50:16
Lists: pgsql-hackers
On 2013-01-14 16:35:48 -0500, Tom Lane wrote:
> Since commit 2065dd2834e832eb820f1fbcd16746d6af1f6037, there have been
> a few buildfarm failures along the lines of
>   -- Commit table drop
>   COMMIT PREPARED 'regress-two';
> ! PANIC:  failed to re-find shared proclock object
> ! PANIC:  failed to re-find shared proclock object
> ! connection to server was lost
> Evidently I bollixed something, but what?  I've been unable to reproduce
> this locally so far.  Anybody see what's wrong?
> Another thing is that dugong has been reproducibly failing with
>  drop cascades to table testschema.atable
>   -- Should succeed
>   DROP TABLESPACE testspace;
> + ERROR:  tablespace "testspace" is not empty
> since the elog-doesn't-return patch (b853eb97) went in.  Maybe this is
> some local problem there but I'm suspicious that there's a connection.
> But what?
> Any insights out there?

It also has:

LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down
LOG:  shutting down
FATAL:  could not open file "base/16384/28182": No such file or directory
CONTEXT:  writing block 6 of relation base/16384/28182
TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1743)
LOG:  checkpointer process (PID 30366) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes
LOG:  abnormal database system shutdown

================== stack trace: pgsql.9958/src/test/regress/tmp_check/data/core ==================
Using host libthread_db library "/lib/tls/".

warning: Can't read pathname for load map: Input/output error.
Core was generated by `postgres: checkpointer process                                                '.
Program terminated with signal 6, Aborted.

#0  0xa000000000010620 in __kernel_syscall_via_break ()
#0  0xa000000000010620 in __kernel_syscall_via_break ()
#1  0x2000000000953bb0 in raise () from /lib/tls/
#2  0x2000000000956df0 in abort () from /lib/tls/
#3  0x4000000000b4b510 in ExceptionalCondition (
    conditionName=0x4000000000d76390 "!(PrivateRefCount[i] == 0)", 
    errorType=0x4000000000d06500 "FailedAssertion", 
    fileName=0x4000000000d76260 "bufmgr.c", lineNumber=1743) at assert.c:54
#4  0x40000000007a7d20 in AtProcExit_Buffers (code=1, arg=59) at bufmgr.c:1743
#5  0x40000000007c4e50 in shmem_exit (code=1) at ipc.c:221
#6  0x40000000007c4fa0 in proc_exit_prepare (code=1) at ipc.c:181
#7  0x40000000007c4ab0 in proc_exit (code=1) at ipc.c:96
#8  0x4000000000b5d390 in errfinish (dummy=0) at elog.c:518
#9  0x4000000000823380 in _mdfd_getseg (reln=0x6000000000155420, 
    forknum=1397792, blkno=6, skipFsync=0 '\0', behavior=EXTENSION_FAIL)
    at md.c:577
#10 0x400000000081e5c0 in mdwrite (reln=0x6000000000155420, 
    forknum=MAIN_FORKNUM, blocknum=6, buffer=0x2000000001432ea0 "", 
    skipFsync=0 '\0') at md.c:735
#11 0x4000000000824690 in smgrwrite (reln=0x6000000000155420, 
    forknum=MAIN_FORKNUM, blocknum=6, buffer=0x2000000001432ea0 "", 
    skipFsync=0 '\0') at smgr.c:534
#12 0x400000000079e510 in FlushBuffer (buf=0x1, reln=0x6000000000155420)
    at bufmgr.c:1941
#13 0x40000000007a10b0 in SyncOneBuffer (buf_id=0, skip_recently_used=0 '\0')
    at bufmgr.c:1677
#14 0x40000000007a0c00 in CheckPointBuffers (flags=5) at bufmgr.c:1284
#15 0x40000000001fcbf0 in CheckPointGuts (checkPointRedo=80827000, flags=5)
    at xlog.c:7391
#16 0x40000000001fb2a0 in CreateCheckPoint (flags=5) at xlog.c:7240
#17 0x40000000001f6820 in ShutdownXLOG (code=14699520, 
    arg=4611686018440093920) at xlog.c:6823
#18 0x400000000072d780 in _setjmp_lpad_CheckpointerMain_0$0$18 ()
    at checkpointer.c:413
#19 0x4000000000235810 in AuxiliaryProcessMain (argc=496536, 
    argv=0x60000fffff80e520) at bootstrap.c:433
#20 0x40000000007172b0 in StartChildProcess (type=508288) at postmaster.c:4956
#21 0x4000000000713f50 in reaper (postgres_signal_arg=30365)
    at postmaster.c:2568
#22 <signal handler called>
#23 0xa000000000010620 in __kernel_syscall_via_break ()
#24 0x2000000000953f70 in sigprocmask () from /lib/tls/
#25 0x4000000000720480 in ServerLoop () at postmaster.c:1521
#26 0x400000000071d9d0 in PostmasterMain (argc=6, argv=0x60000000000d85e0)
    at postmaster.c:1244
#27 0x4000000000577a30 in main (argc=6, argv=0x60000000000d8010) at main.c:197

in the log. So it seems it could also be related to the locking
changes, although I don't immediately see why.
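
One way to read the backtrace above, sketched as a toy model rather than as the real bufmgr.c code: the FATAL raised under _mdfd_getseg goes through errfinish straight to proc_exit, and if the buffer being flushed is still pinned at that point, the exit-time check in AtProcExit_Buffers finds a nonzero PrivateRefCount entry. The names below (private_ref_count, sync_one_buffer, leaked_pins, exit_time_check, NBUFFERS) are hypothetical stand-ins, and the assumption that the pin is taken around the flush is mine, not something the trace states.

```c
#include <assert.h>

#define NBUFFERS 4   /* hypothetical tiny buffer pool, for illustration only */

/* Per-process pin counts, loosely modeled on PrivateRefCount[] from the trace. */
int private_ref_count[NBUFFERS];

void pin_buffer(int buf_id)   { private_ref_count[buf_id]++; }
void unpin_buffer(int buf_id) { private_ref_count[buf_id]--; }

/* Stand-in for the write path; fails when the relation file is missing. */
int flush_buffer(int buf_id, int file_missing)
{
    (void) buf_id;
    return file_missing ? -1 : 0;
}

/*
 * Stand-in for syncing one buffer: pin, write, unpin.
 * On a failed write it returns at once without unpinning, which mimics
 * a FATAL error that heads for process exit with no cleanup in between.
 */
int sync_one_buffer(int buf_id, int file_missing)
{
    pin_buffer(buf_id);
    if (flush_buffer(buf_id, file_missing) != 0)
        return -1;              /* pin is still held here */
    unpin_buffer(buf_id);
    return 0;
}

/* Count buffers whose pin count is nonzero. */
int leaked_pins(void)
{
    int leaked = 0;

    for (int i = 0; i < NBUFFERS; i++)
        if (private_ref_count[i] != 0)
            leaked++;
    return leaked;
}

/* Stand-in for the exit-time assertion: aborts if any pin was leaked. */
void exit_time_check(void)
{
    assert(leaked_pins() == 0);
}
```

In this model, calling sync_one_buffer(1, 1) followed by exit_time_check() aborts the process, much as the checkpointer did with signal 6. If that reading is right, the assertion is a follow-on symptom and the missing file "base/16384/28182" is the thing to explain.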


Andres Freund

 PostgreSQL Development, 24x7 Support, Training & Services
