
Re: Recent SIGSEGV failures in buildfarm HEAD

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Recent SIGSEGV failures in buildfarm HEAD
Date: 2006-12-28 21:02:22
Message-ID: 459430DE.5080801@kaltenbrunner.cc
Lists: pgsql-hackers, pgsql-patches
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>> Tom Lane wrote:
>>> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>>>> ... Maybe something is causing a dramatic
>>>> increase in memory usage that is causing the random failures (in impala's
>>>> case the OOM-killer actually decides to terminate the postmaster)?
>>> No, most all the failures I've looked at are sig11 not sig9.
> 
>> hmm - still weird and I would not actually consider impala a resource
>> starved box (especially when compared to other buildfarm-members) so
>> there seems to be something strange going on.
> 
> Actually ... one way that a "memory overconsumption" bug could manifest
> as sig11 would be if it's a runaway-recursion issue: usually you get sig11
> when the machine's stack size limit is exceeded.  This doesn't put us
> any closer to localizing the problem, but at least it's a guess about
> the cause?

that sounds like a possibility though I'm not too optimistic this is
indeed the cause of the problem we see.

> 
> I wonder whether there's any way to get the buildfarm script to report a
> stack trace automatically if it finds a core file left behind in the
> $PGDATA directory after running the tests.  Would something like this
> be adequately portable?
> 
> 	if [ -f $PGDATA/core* ]
> 	then
> 		echo bt | gdb $installdir/bin/postgres $PGDATA/core*
> 	fi

hmmm - not sure I like that that much

> 
> Obviously it'd fail if no gdb available, but that seems pretty harmless.
> The other thing that we'd likely need is an explicit "ulimit -c
> unlimited" for machines where core dumps are off by default.

there are other issues with that - gdb might be available but not
actually produce reliable results on certain platforms (some
commercial Unixes, Windows).
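
For illustration, here is a minimal sketch of a more defensive variant of the quoted check. It is not the buildfarm's code: the helper names find_cores, report_backtraces and enable_core_dumps are hypothetical, and it assumes a POSIX-style shell and a gdb that supports -batch and -ex. It sidesteps one portability problem in the quoted snippet, namely that `[ -f $PGDATA/core* ]` errors out when the glob matches more than one file, and it skips quietly when gdb is missing. It does nothing about gdb producing unreliable output on some platforms.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical, not buildfarm code.

# Print each core file found directly in directory $1, one per line.
find_cores() {
    for f in "$1"/core*; do
        # An unmatched glob stays literal, so confirm the file really exists.
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Try to enable core dumps for this shell and its children.
# The request may be refused if the hard limit is lower.
enable_core_dumps() {
    ulimit -c unlimited 2>/dev/null ||
        echo "could not raise core size limit" >&2
}

# Print a batch-mode backtrace for every core file in data directory $2,
# using the postgres executable $1. Skips silently-ish if gdb is absent.
report_backtraces() {
    postgres_bin=$1
    data_dir=$2
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found; skipping stack traces" >&2
        return 0
    fi
    find_cores "$data_dir" | while IFS= read -r core; do
        echo "=== backtrace for $core ==="
        gdb -batch -ex bt "$postgres_bin" "$core" 2>&1
    done
}
```

A run would call enable_core_dumps before starting the tests and then something like `report_backtraces "$installdir/bin/postgres" "$PGDATA"` afterwards.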

What we might want to do is have the buildfarm script override
keep_error_builds=0 conditionally in some cases (such as when it detects a core).

That way we will at least have a useful build tree for later
examination (one that would otherwise be removed when keep_error_builds
is disabled, even if we got a one-time stack trace).
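
A rough sketch of that override, written as shell purely for illustration. The function effective_keep_error_builds is a hypothetical helper, and the sketch assumes that a keep_error_builds value of 0 means "discard failed build trees" and any non-zero value means "keep them"; the real setting lives in the buildfarm client's configuration and may behave differently.

```shell
#!/bin/sh
# Sketch only: effective_keep_error_builds is a hypothetical helper.

# Echo the keep_error_builds value to use for this run.
#   $1: configured keep_error_builds value (assumed: 0 = discard failed trees)
#   $2: data directory to check for core files
effective_keep_error_builds() {
    configured=$1
    data_dir=$2
    for f in "$data_dir"/core*; do
        if [ -f "$f" ]; then
            # A core file exists: keep the tree even if the config says 0.
            if [ "$configured" -eq 0 ]; then
                echo 1
            else
                echo "$configured"
            fi
            return 0
        fi
    done
    # No core file found: honor the configured value unchanged.
    echo "$configured"
}
```

For example, `keep=$(effective_keep_error_builds 0 "$PGDATA")` would yield 1 only when a core file is present in the data directory.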


Stefan
