Re: Why is infinite_recurse test suddenly failing?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, mark(at)2ndQuadrant(dot)com
Subject: Re: Why is infinite_recurse test suddenly failing?
Date: 2019-05-10 18:27:07
Message-ID: 20190510182707.l6xiar3s62nsznvg@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-05-10 11:38:57 -0400, Tom Lane wrote:
> Core was generated by `postgres: debian regression [local] SELECT '.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 sysmalloc (nb=8208, av=0x3fff916e0d28 <main_arena>) at malloc.c:2748
> 2748 malloc.c: No such file or directory.
> #0 sysmalloc (nb=8208, av=0x3fff916e0d28 <main_arena>) at malloc.c:2748
> #1 0x00003fff915bedc8 in _int_malloc (av=0x3fff916e0d28 <main_arena>, bytes=8192) at malloc.c:3865
> #2 0x00003fff915c1064 in __GI___libc_malloc (bytes=8192) at malloc.c:2928
> #3 0x00000000106acfd8 in AllocSetContextCreateInternal (parent=0x1000babdad0, name=0x1085508c "inline_function", minContextSize=<optimized out>, initBlockSize=<optimized out>, maxBlockSize=8388608) at aset.c:477
> #4 0x00000000103d5e00 in inline_function (funcid=20170, result_type=<optimized out>, result_collid=<optimized out>, input_collid=<optimized out>, funcvariadic=<optimized out>, func_tuple=<optimized out>, context=0x3fffe3da15d0, args=<optimized out>) at clauses.c:4459
> #5 simplify_function (funcid=<optimized out>, result_type=<optimized out>, result_typmod=<optimized out>, result_collid=<optimized out>, input_collid=<optimized out>, args_p=<optimized out>, funcvariadic=<optimized out>, process_args=<optimized out>, allow_non_const=<optimized out>, context=<optimized out>) at clauses.c:4040
> #6 0x00000000103d2e74 in eval_const_expressions_mutator (node=0x1000babe968, context=0x3fffe3da15d0) at clauses.c:2474
> #7 0x00000000103511bc in expression_tree_mutator (node=<optimized out>, mutator=0x103d2b10 <eval_const_expressions_mutator>, context=0x3fffe3da15d0) at nodeFuncs.c:2893

> So that lets out any theory that somehow we're getting into a weird
> control path that misses calling check_stack_depth;
> expression_tree_mutator does so for one, and it was called just nine
> stack frames down from the crash.

Right. There are plenty of places that check it...

> I am wondering if, somehow, the stack depth limit seen by the postmaster
> sometimes doesn't apply to its children. That would be pretty wacko
> kernel behavior, especially if it's only intermittently true.
> But we're running out of other explanations.
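
A minimal sketch of how that theory could be spot-checked, assuming a POSIX system: compare RLIMIT_STACK as seen by a parent and by a forked child. The function name child_inherits_stack_limit is hypothetical and this is not PostgreSQL code; a single matching result would not rule out an intermittent problem.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a forked child reports the same
 * RLIMIT_STACK (soft and hard) as the parent, 0 if it differs, and
 * -1 if the check itself failed.
 */
int
child_inherits_stack_limit(void)
{
    struct rlimit parent_lim;
    struct rlimit child_lim;
    pid_t       pid;
    int         status;

    if (getrlimit(RLIMIT_STACK, &parent_lim) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        /* Child: re-read the limit and report the comparison via exit code. */
        if (getrlimit(RLIMIT_STACK, &child_lim) != 0)
            _exit(2);
        if (child_lim.rlim_cur == parent_lim.rlim_cur &&
            child_lim.rlim_max == parent_lim.rlim_max)
            _exit(0);
        _exit(1);
    }

    /* Parent: collect the child's verdict. */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 0)
        return 1;
    if (WEXITSTATUS(status) == 1)
        return 0;
    return -1;
}
```

Calling child_inherits_stack_limit() repeatedly on the affected machine would show whether the limit ever differs across fork(); it says nothing about whether the kernel can actually back the stack with memory.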

I wonder if this is a SIGSEGV that actually signals an OOM
situation. Linux, if it can't actually extend the stack on-demand due to
OOM, sends a SIGSEGV. The signal has that information, but
unfortunately the buildfarm code doesn't print it. p $_siginfo would
show us some of that...
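
For illustration, the same fields that p $_siginfo shows in gdb can also be captured in-process. The following is a sketch only, assuming Linux/POSIX; the names segv_code_name and install_segv_reporter are hypothetical and neither PostgreSQL nor the buildfarm client does this. It installs a SIGSEGV handler on an alternate stack (needed because the normal stack may be unusable when the fault is a stack problem) and prints si_code and si_addr.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: short label for the si_code of a SIGSEGV. */
const char *
segv_code_name(int si_code)
{
    switch (si_code)
    {
        case SEGV_MAPERR:
            return "SEGV_MAPERR (address not mapped)";
        case SEGV_ACCERR:
            return "SEGV_ACCERR (invalid permissions)";
        default:
            return "other si_code";
    }
}

/* Report what the kernel told us about the fault, then exit. */
static void
segv_handler(int signo, siginfo_t *info, void *ucontext)
{
    char        buf[256];
    int         len;

    (void) signo;
    (void) ucontext;

    /*
     * snprintf is not formally async-signal-safe; it is tolerated here
     * only because this is a diagnostic sketch that exits immediately.
     */
    len = snprintf(buf, sizeof(buf),
                   "SIGSEGV: si_code=%d [%s] si_addr=%p\n",
                   info->si_code, segv_code_name(info->si_code),
                   info->si_addr);
    if (len > 0 && write(STDERR_FILENO, buf, (size_t) len) < 0)
    {
        /* nothing useful can be done if the write fails */
    }
    _exit(1);
}

/* Hypothetical helper: returns 0 on success, -1 on failure. */
int
install_segv_reporter(void)
{
    /* Fixed-size alternate stack, so the handler can run after a stack fault. */
    static char altstack[64 * 1024];
    stack_t     ss;
    struct sigaction sa;

    ss.ss_sp = altstack;
    ss.ss_size = sizeof(altstack);
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Which si_code the kernel reports when it cannot extend the stack is not stated above, so the sketch prints the raw value rather than assuming one.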

Mark, how tight is the memory on that machine? Does dmesg have any other
information? (Segfaults are often logged by the kernel along with the
code, IIRC.)

Greetings,

Andres Freund
