From: | Christoph Berg <christoph(dot)berg(at)credativ(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, bastian(dot)blank(at)credativ(dot)de |
Subject: | Re: 9.4 beta1 crash on Debian sid/i386 |
Date: | 2014-05-19 14:47:17 |
Message-ID: | 20140519144717.GG7296@msgid.df7cb.de |
Lists: | pgsql-hackers |
Re: Andres Freund 2014-05-19 <20140519141221(dot)GC5098(at)alap3(dot)anarazel(dot)de>
> On 2014-05-19 09:53:11 -0400, Tom Lane wrote:
> > I think throwing an error out of a SIGBUS handler is right out. There
> > would be no way to know exactly what code we were interrupting. It's
> > the same reason we don't let, eg, the SIGALRM handler throw a timeout
> > error directly (in most places anyway).
Right. I just mentioned that for completeness.
> Agreed. I think if we really, really feel the need to do something about
> this - which I don't - we could allocate a separate stack very early on
> and use that.
Hmm, that'd be an extension of the other idea, "write something deep
in the stack on startup". This is probably less evil, though I agree
it's a big hammer for solving something that should probably be fixed
elsewhere.
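For reference, the "separate stack" idea could look roughly like the sketch below. It is not PostgreSQL code: the names `install_alt_stack_handler`, `fatal_signal_handler` and `ALT_STACK_SIZE` are made up for illustration, and the 64 kB size is an arbitrary assumption. The handler only reports and exits, in line with the point above that throwing an error from it is out.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* expose sigaltstack() under strict -std modes */
#endif
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical size for the alternate stack; an assumption, not a tuned value. */
#define ALT_STACK_SIZE ((size_t) 64 * 1024)

/*
 * Hypothetical handler: runs on the alternate stack, so it still works when
 * the normal stack is exhausted. It uses only async-signal-safe calls and
 * never returns into the interrupted code.
 */
static void
fatal_signal_handler(int signo)
{
    static const char msg[] = "fatal signal caught on alternate stack\n";

    (void) signo;
    (void) write(STDERR_FILENO, msg, sizeof(msg) - 1);
    _exit(2);
}

/*
 * Allocate an alternate signal stack and route "signo" to it.
 * Meant to be called once, early at startup. Returns 0 on success, -1 on error.
 */
int
install_alt_stack_handler(int signo)
{
    stack_t          ss;
    struct sigaction sa;

    /* Sanity check: the chosen size must meet the platform minimum. */
    assert((long) ALT_STACK_SIZE >= (long) MINSIGSTKSZ);

    ss.ss_sp = malloc(ALT_STACK_SIZE);
    if (ss.ss_sp == NULL)
        return -1;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
    {
        free(ss.ss_sp);
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = fatal_signal_handler;
    sa.sa_flags = SA_ONSTACK;       /* deliver this signal on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

A caller would run something like `install_alt_stack_handler(SIGSEGV)` (and the same for `SIGBUS`) before doing any deep recursion. Whether that is worth the extra memory per process is exactly the "big hammer" question.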
> > >> * PostgreSQL allocates lots of heap using brk() instead of mmap()
> >
> > > It doesn't really do that, btw. It's the libc's mmap that makes those
> > > decisions, not postgres.
> >
> > It occurs to me that maybe this is a glibc bug, not a kernel bug?
>
> You think malloc() should try to be careful when calling brk() and check
> beforehand whether it'll conflict with stack_base + RLIMIT_STACK? That's
> not a bad argument, but it still seems a really bad choice to leave that
> little space for the heap. Especially when it's dependent on -pie being
> used.
It's probably both, the default ASLR layout providing too little heap,
plus malloc() running into the stack area - I'm not sure if the former
is the kernel's fault or libc/ld.so's, probably they need to work
together on that anyway.
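To make the brk()-versus-stack question concrete, here is a rough sketch of the kind of check being discussed. It is not what glibc does. The function names are invented for illustration, the address of a local variable is used as a stand-in for the stack base (which makes the estimate slightly pessimistic), and the heap is assumed to sit below the stack.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* expose sbrk() on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: how many bytes the program break can grow before it
 * reaches the region reserved for the stack (stack_top - stack_limit).
 * Returns 0 if the break is already inside that region, and UINTPTR_MAX if
 * the break lies above the stack and so cannot collide with it by growing.
 */
uintptr_t
heap_headroom(uintptr_t brk_addr, uintptr_t stack_top, uintptr_t stack_limit)
{
    uintptr_t stack_floor;

    if (brk_addr > stack_top)
        return UINTPTR_MAX;

    /* Lowest address the stack is allowed to grow down to. */
    stack_floor = (stack_top > stack_limit) ? stack_top - stack_limit : 0;
    if (brk_addr >= stack_floor)
        return 0;
    return stack_floor - brk_addr;
}

/*
 * Hypothetical wrapper: compute the headroom for the running process.
 * Returns 0 and fills *headroom on success; returns -1 if the break or the
 * stack limit cannot be determined (including an unlimited RLIMIT_STACK).
 */
int
current_heap_headroom(uintptr_t *headroom)
{
    struct rlimit rl;
    char          marker;       /* its address approximates the stack position */
    void         *cur_break;

    assert(headroom != NULL);

    cur_break = sbrk(0);
    if (cur_break == (void *) -1)
        return -1;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return -1;

    *headroom = heap_headroom((uintptr_t) cur_break,
                              (uintptr_t) &marker,
                              (uintptr_t) rl.rlim_cur);
    return 0;
}
```

Comparing the result of `current_heap_headroom()` between a -pie and a non-pie build on i386 would be one way to see how little heap the default layout leaves.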
Disabling -pie for all 32-bit archs seems to be the way to go for us
now.
Does this topic warrant being mentioned in the docs?
Christoph