Re: Why is infinite_recurse test suddenly failing?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, mark(at)2ndQuadrant(dot)com, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Why is infinite_recurse test suddenly failing?
Date: 2019-05-10 19:35:17
Message-ID: 14080.1557516917@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2019-05-10 11:38:57 -0400, Tom Lane wrote:
>> I am wondering if, somehow, the stack depth limit seen by the postmaster
>> sometimes doesn't apply to its children. That would be pretty wacko
>> kernel behavior, especially if it's only intermittently true.
>> But we're running out of other explanations.

> I wonder if this is a SIGSEGV that actually signals an OOM
> situation. Linux, if it can't actually extend the stack on-demand due to
> OOM, sends a SIGSEGV. The signal has that information, but
> unfortunately the buildfarm code doesn't print it. p $_siginfo would
> show us some of that...

> Mark, how tight is the memory on that machine? Does dmesg have any other
> information (often segfaults are logged by the kernel with the code
> IIRC).

It does sort of smell like a resource exhaustion problem, especially
if all these buildfarm animals are VMs running on the same underlying
platform. But why would that manifest as "you can't have a measly two
megabytes of stack" and not as any other sort of OOM symptom?

Mark, if you don't mind modding your local copies of the buildfarm
script, I think what Andres is asking for is a pretty trivial addition
in PGBuild/Utils.pm's sub get_stack_trace:

my $cmdfile = "./gdbcmd";
my $handle;
open($handle, '>', $cmdfile) || die "opening $cmdfile: $!";
print $handle "bt\n";
+ print $handle "p \$_siginfo\n";
close($handle);
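
For anyone who wants to try this outside the buildfarm script first, here is a minimal standalone sketch of the same idea. It is an illustration only: write_gdb_cmdfile is a hypothetical name, not a function in PGBuild/Utils.pm, and the file name is just the one used above. The one thing to watch is that gdb's convenience variable $_siginfo has to reach the command file literally, so the sketch single-quotes it to keep Perl from trying to interpolate it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the buildfarm client): write a gdb
# command file that requests a backtrace plus the signal details.
sub write_gdb_cmdfile
{
    my ($cmdfile) = @_;
    open(my $handle, '>', $cmdfile) || die "opening $cmdfile: $!";
    print $handle "bt\n";
    # Single quotes: Perl must not interpolate $_siginfo, gdb needs it as-is.
    print $handle 'p $_siginfo', "\n";
    close($handle) || die "closing $cmdfile: $!";
    return $cmdfile;
}

my $cmdfile = write_gdb_cmdfile("./gdbcmd");

# Read the file back to confirm exactly what gdb would be given.
open(my $in, '<', $cmdfile) || die "opening $cmdfile: $!";
my @lines = <$in>;
close($in);
print @lines;
```

The resulting file can then be fed to gdb in batch mode against the core file, along the lines of "gdb -batch -x ./gdbcmd <postgres binary> <core file>", with the two paths filled in for the local installation.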

regards, tom lane
