Re: can we optimize STACK_DEPTH_SLOP

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: can we optimize STACK_DEPTH_SLOP
Date: 2016-07-06 13:34:49
Message-ID: 19108.1467812089@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> I did a similar(ish) test which is admittedly not as exhaustive as
> using pmap. I instrumented check_stack_depth() itself to keep track of
> a high water mark (and based on Robert's thought process) to keep
> track of the largest increment over the previous checked stack depth.
> This doesn't cover any cases where there's no check_stack_depth() call
> in the call stack at all (but then if there's no check_stack_depth
> call at all it's hard to see how any setting of STACK_DEPTH_SLOP is
> necessarily going to help).

Well, the point of STACK_DEPTH_SLOP is that we don't want to have to
put check_stack_depth calls in every function in the backend, especially
not otherwise-inexpensive leaf functions. So the idea is for the slop
number to cover the worst-case call graph after the last function with a
check. Your numbers are pretty interesting, in that they clearly prove
we need a slop value of at least 40-50K, but they don't really show that
that's adequate.

I'm a bit disturbed by the fact that you seem to be showing maximum
measured depth for the non-outlier tests as only around 40K-ish.
That doesn't match up very well with my pmap results, since in no
case did I see a physical stack size below 188K.

[ pokes around for a little bit... ] Oh, this is interesting: it looks
like the *postmaster*'s stack size is 188K, and of course every forked
child is going to inherit that as a minimum stack depth. What's more,
pmap shows stack sizes near that for all my running postmasters going back
to 8.4. But 8.3 and before show a stack size of 84K, which seems to be
some sort of minimum on this machine; even a trivial "cat" process has
that stack size according to pmap.

Conclusion: something we did in 8.4 greatly bloated the postmaster's
stack space consumption, to the point that it's significantly more than
anything a normal backend does. That's surprising and scary, because
it means the postmaster is *more* exposed to stack SIGSEGV than most
backends. We need to find the cause, IMO.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Oskari Saarenmaa 2016-07-06 13:46:53 Don't include MMAP DSM's transient files in basebackup
Previous Message Merlin Moncure 2016-07-06 12:46:00 Re: EXISTS clauses not being optimized in the face of 'one time pass' optimizable expressions