Re: 9.4 beta1 crash on Debian sid/i386

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Christoph Berg <christoph(dot)berg(at)credativ(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, bastian(dot)blank(at)credativ(dot)de
Subject: Re: 9.4 beta1 crash on Debian sid/i386
Date: 2014-05-19 12:43:13
Message-ID: 20140519124313.GA5098@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-05-19 13:53:18 +0200, Christoph Berg wrote:
> I've done some more digging. The problem exists also on plain 32bit
> kernels, not only 64bit running a 32bit userland. (Tested on Debian
> Wheezy's 3.2.57 kernel.)

Too bad.

> Debian/Ubuntu have been using hardened PostgreSQL builds for years
> now, including running the regression tests - apparently we were
> always close to a crash, it just had not happened yet.

There might be some user defined workloads triggering it as well...

> So there's a few points to consider:
> * ASLR leaves only 125MB for brk()-style heap plus stack
> * RLIMIT_STACK is treated as an upper limit, not a reservation
> * PostgreSQL thinks max_stack_depth=2MB plus check_stack_depth() is
> safe, instead of having a SIGBUS handler
> * PostgreSQL allocates lots of heap using brk() instead of mmap()

* postgres on debian is built with -pie.

> If any of that wouldn't hold, the problem wouldn't appear.
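
To see how much room a given build actually has, one can compare the
distance between the program break and the current stack position with
RLIMIT_STACK. This is only a rough sketch for Linux/glibc, not anything
from the PostgreSQL tree; the function names are made up for
illustration, and the numbers will vary between runs when ASLR is on.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

/* Absolute distance in bytes between the current program break and stack_addr. */
static uintptr_t heap_to_stack_gap(const void *stack_addr)
{
    uintptr_t brk_now = (uintptr_t) sbrk(0);   /* current end of the brk() heap */
    uintptr_t sp = (uintptr_t) stack_addr;

    return sp > brk_now ? sp - brk_now : brk_now - sp;
}

/* Soft RLIMIT_STACK in bytes; 0 means unlimited or unknown. */
static uintptr_t stack_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (uintptr_t) rl.rlim_cur;
}

/* Print both numbers so they can be compared across runs and builds. */
static void report_layout(void)
{
    char marker;    /* lives in the current stack frame */

    printf("brk-to-stack gap: %llu kB\n",
           (unsigned long long) (heap_to_stack_gap(&marker) / 1024));
    printf("RLIMIT_STACK:     %llu kB\n",
           (unsigned long long) (stack_soft_limit() / 1024));
}
```

Calling report_layout() from main() in a 32bit binary built once with
and once without -pie should show whether the gap is the limiting
factor. Since RLIMIT_STACK is only an upper limit, a gap that is small
relative to what the heap will consume is the warning sign.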

> I'm not sure where to go from here. Getting the kernel (or the libc)
> changed seems hard, and that would probably only affect future
> distributions anyway.

Hm, this certainly looks like the kind of bug whose fix should get
backported to -stable et al.

> A short-term fix might be to reduce
> max_stack_depth for the regression tests, which tests the
> functionality, but leaves the problem open for production.
> Implementing a SIGBUS/SIGSEGV handler would probably mean that the
> whole ouch-lets-restart-on-error logic would become ineffective,
> unless we go check which address caused the error and decide if it was
> part of the stack or not.

Meh. I am pretty staunchly set against trying this. This is putting
complex tape over the problem. And we'd have significant problems
discerning the different kinds of SIGBUS errors or such.
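
For reference, the max_stack_depth check mentioned above boils down to
comparing the address of a local variable against a base address
recorded at startup. The following is a simplified sketch of that idea,
not the actual PostgreSQL code; all names, the 1kB frame padding and
the recursion cap are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEVEL_CAP 100000                /* safety net for the demo only */

static uintptr_t stack_base_addr = 0;   /* recorded once near the top of the stack */
static uintptr_t max_depth_bytes = 2UL * 1024 * 1024;  /* cf. max_stack_depth=2MB */

/* Remember a reference address; the caller passes the address of one of its locals. */
static void set_stack_base(const void *addr)
{
    stack_base_addr = (uintptr_t) addr;
}

/* True if the current frame is further than max_depth_bytes from the base. */
static bool stack_is_too_deep(void)
{
    char here;
    uintptr_t now = (uintptr_t) &here;
    uintptr_t depth = now > stack_base_addr ? now - stack_base_addr
                                            : stack_base_addr - now;

    return depth > max_depth_bytes;
}

/* Recurse until the check fires (or the cap is hit); returns the level reached. */
static int recurse(int level)
{
    volatile char pad[1024];            /* make every frame use at least 1kB */

    pad[0] = (char) level;
    if (stack_is_too_deep() || level >= LEVEL_CAP)
        return level;
    return recurse(level + 1) + 0 * pad[0];
}
```

The check only measures how much stack has been used. It cannot know
whether the kernel is still able to grow the stack up to the configured
limit, so it offers no protection when the heap already sits right
below the stack.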

Isn't the far more obvious thing to just not build postgres with -pie on
32bit? It's hardly a security benefit if it allows a plain user to crash
the server.
Besides the stack problem, have you measured whether it's viable to use
-pie on 32bit performance-wise? That stuff's not that cheap, especially
on 32bit.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
