Re: stress test for parallel workers

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Mark Wong <mark(at)2ndquadrant(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: stress test for parallel workers
Date: 2019-10-12 21:25:52
Message-ID: 17884.1570915552@sss.pgh.pa.us

I've now also been able to reproduce the "infinite_recurse" segfault
on wobbegong's host (or, since I was using a gcc build, I guess I
should say vulpes' host). The first-order result is that it's the
same problem with the kernel not giving us as much stack space as
we expect: there are only 1179648 bytes in the stack segment in the
core dump, though we should certainly have been allowed at least 8MB.

The next interesting thing is that looking closely at the identified
spot of the SIGSEGV, there's nothing there that should be touching
the stack at all:

(gdb) x/4i $pc
=> 0x10201df0 <core_yylex+1072>: ld r9,0(r30)
   0x10201df4 <core_yylex+1076>: ld r8,128(r30)
   0x10201df8 <core_yylex+1080>: ld r10,152(r30)
   0x10201dfc <core_yylex+1084>: ld r9,0(r9)

(r30 is not pointing at the stack, but at a valid heap location.)
This code is the start of the switch case at scan.l:1064, so the
most recent successfully-executed instructions were the switch jump,
and they don't involve the stack either.

The reported sp,

(gdb) i reg sp
sp 0x7fffe6940890 140737061849232

is a good 2192 bytes above the bottom of the allocated stack space,
which is 0x7fffe6940000 according to gdb. So we really ought to
have plenty of margin here. What's going on?
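
The margin arithmetic can be checked with a few lines of Python. This is only a sketch: the two addresses are copied from the gdb output above, while SIGNAL_FRAME_ESTIMATE is an assumed byte count standing in for the "about 1.5kB" that the kernel comment quoted further down mentions.

```python
# Addresses copied from the gdb session in this message.
SP = 0x7fffe6940890            # reported stack pointer
STACK_BOTTOM = 0x7fffe6940000  # lowest allocated stack address per gdb

# Assumption for illustration: "about 1.5kB" taken as exactly 1536 bytes.
SIGNAL_FRAME_ESTIMATE = 1536

# Bytes still available between sp and the bottom of the stack mapping.
margin = SP - STACK_BOTTOM

print(f"sp in decimal:            {SP}")
print(f"bytes available below sp: {margin}")
print(f"spare beyond ~1.5kB:      {margin - SIGNAL_FRAME_ESTIMATE}")
```

On these numbers a write of about 1.5kB below sp would still land inside the allocated stack, which is why the crash looks surprising.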

Given the difficulty of reproducing this, I suspect that what really
happened is that the kernel tried to deliver a SIGUSR1 signal to us
just at this point. The kernel source code that Thomas pointed to
includes the comment

* The kernel signal delivery code writes up to about 1.5kB
* below the stack pointer (r1) before decrementing it.

There's more than 1.5kB available below sp, but what if that comment
is a lie? In particular, I'm wondering if that number dates to PPC32
and needs to be doubled, or nearly so, to describe PPC64 reality.
If that were the case, then the signal code would not have been
able to fit its requirement, and would probably have come to the code
Thomas pointed to in order to ask for more stack space, where the
hard-wired 2048 test a little further down would have decided that
this was a wild stack access.

In short, my current belief is that Linux PPC64 fails when trying
to deliver a signal if there's right around 2KB of stack remaining,
even though it should be able to expand the stack and press on.
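
To make the suspected failure mode concrete, here is a toy model in Python. It is a sketch of the hypothesis, not kernel code: the function name deliver_signal, the frame sizes tried, and the assumption that the faulting address is the lowest byte of the signal frame are all made up for illustration. Only the two addresses and the 2048-byte limit come from this message.

```python
# Addresses copied from the gdb session in this message.
SP = 0x7fffe6940890
STACK_BOTTOM = 0x7fffe6940000

# The hard-wired limit mentioned above: an access further than this
# below sp is treated as a wild stack access rather than stack growth.
WILD_ACCESS_LIMIT = 2048


def deliver_signal(sp, frame_size):
    """Toy model: classify a write of frame_size bytes just below sp.

    Assumes (hypothetically) that the fault is reported at the lowest
    address of the frame.
    """
    lowest = sp - frame_size
    if lowest >= STACK_BOTTOM:
        return "fits in allocated stack"
    # The write runs off the allocated stack, so the kernel must decide
    # whether to grow the stack or treat the access as bogus.
    if lowest + WILD_ACCESS_LIMIT < sp:
        return "rejected as wild access"
    return "stack expanded"


# A frame of ~1.5kB fits; one roughly twice that size does not.
for size in (1536, 3072):
    print(size, "->", deliver_signal(SP, size))

# With less room below sp, the same 1.5kB frame would be allowed to grow the stack.
print(1536, "->", deliver_signal(STACK_BOTTOM + 1000, 1536))
```

In this model, with 2192 bytes left below sp, any frame too large to fit is also more than 2048 bytes below sp, so the expansion path is never taken. That matches the narrow window in which the failure would show up.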

It may well be that the reason is just that this heuristic in
bad_stack_expansion() is out of date. Or there might be a similarly
bogus value somewhere in the signal-delivery code.

regards, tom lane
