Re: Maximum function call nesting depth for regression tests

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Maximum function call nesting depth for regression tests
Date: 2010-11-01 05:12:29
Message-ID: 13089.1288588349@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sat, Oct 30, 2010 at 10:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't especially care for choice #1. To me, one of the things that
>> the regression tests ought to flag is whether a machine is so limited
>> that "reasonable" coding might fail. If you can't do twenty or so
>> levels of function call you've got a mighty limited machine.

> Agreed. So how much stack space does 10 or 20 nested calls actually use?

I just did some testing with git HEAD on RHEL-5 machines (gcc 4.1.2).
It appears the actual stack consumption for one cycle (plpgsql to sql
back to plpgsql) is 4112 bytes on ia64, 4000 bytes on ppc32, 8256 bytes
on ppc64. Of course these numbers could be expected to vary some
depending on compiler version and options, but 4K to 8K looks like
the expected number.

Now the odd thing about this is that we're not running up against
the actual kernel stack limit, because we're not dumping core.
What we're hitting is the max_stack_depth check; and the reason
that is odd is that we *never* set max_stack_depth to less than 100kB,
no matter what insane reading we might get from getrlimit. So it seems
like there should be enough room for 10 of these cycles, certainly so
on the ia64 machine although maybe ppc64 is marginal. So I'm back to
suspecting funny business on the buildfarm machines.

Just for the record, this is the call stack cycle we're talking about:

#0 plpgsql_call_handler (fcinfo=0x60000fffffcbf9b0) at pl_handler.c:100
#1 0x40000000003ae660 in ExecMakeFunctionResult (fcache=0x60000000001e6180,
econtext=0x60000000001e5fc8,
isNull=0x60000000001e6bc8 "\177~\177\177\177\177\177\177\bW'",
isDone=0x60000000001e6d08) at execQual.c:1836
#2 0x40000000003a4c70 in ExecTargetList (projInfo=<value optimized out>,
isDone=0x60000fffffcbfd60) at execQual.c:5110
#3 ExecProject (projInfo=<value optimized out>, isDone=0x60000fffffcbfd60)
at execQual.c:5325
#4 0x40000000003e36d0 in ExecResult (node=<value optimized out>)
at nodeResult.c:155
#5 0x40000000003a3b10 in ExecProcNode (node=0x60000000001e5eb0)
at execProcnode.c:361
#6 0x40000000003d64f0 in ExecLimit (node=0x60000000001e5b70) at nodeLimit.c:89
#7 0x40000000003a3e50 in ExecProcNode (node=0x60000000001e5b70)
at execProcnode.c:480
#8 0x40000000003a0630 in ExecutePlan (queryDesc=<value optimized out>,
direction=<value optimized out>, count=1) at execMain.c:1236
#9 standard_ExecutorRun (queryDesc=<value optimized out>,
direction=<value optimized out>, count=1) at execMain.c:282
#10 0x40000000003c15b0 in postquel_getnext (fcinfo=0x60000fffffcbfe00)
at functions.c:475
#11 fmgr_sql (fcinfo=0x60000fffffcbfe00) at functions.c:704
#12 0x40000000003ae660 in ExecMakeFunctionResult (fcache=0x60000000001dde80,
econtext=0x60000000001ddc58,
isNull=0x60000000001df1a0 "\177~\177\177\177\177\177\177\030\356#",
isDone=0x60000000001df2e0) at execQual.c:1836
#13 0x40000000003a4c70 in ExecTargetList (projInfo=<value optimized out>,
isDone=0x60000fffffcc01b0) at execQual.c:5110
#14 ExecProject (projInfo=<value optimized out>, isDone=0x60000fffffcc01b0)
at execQual.c:5325
#15 0x40000000003e36d0 in ExecResult (node=<value optimized out>)
at nodeResult.c:155
#16 0x40000000003a3b10 in ExecProcNode (node=0x60000000001ddb40)
at execProcnode.c:361
#17 0x40000000003a0630 in ExecutePlan (queryDesc=<value optimized out>,
direction=<value optimized out>, count=2) at execMain.c:1236
#18 standard_ExecutorRun (queryDesc=<value optimized out>,
direction=<value optimized out>, count=2) at execMain.c:282
#19 0x40000000003fdc40 in _SPI_execute_plan (plan=<value optimized out>,
paramLI=0x60000000001afb60, snapshot=0x0, crosscheck_snapshot=0x0,
read_only=0 '\000', fire_triggers=<value optimized out>, tcount=2)
at spi.c:2092
#20 0x40000000003fe5e0 in SPI_execute_plan_with_paramlist (
plan=0x600000000020767c, params=0x60000000001afb60, read_only=0 '\000',
tcount=2) at spi.c:423
#22 0x2000000006193820 in exec_eval_expr (estate=0x60000fffffcc05a8,
expr=0x6000000000204480, isNull=0x60000fffffcc05b8 "\001",
rettype=0x60000fffffcc05bc) at pl_exec.c:4222
#23 0x200000000619fbf0 in exec_stmt (estate=0x60000fffffcc05a8,
stmts=<value optimized out>) at pl_exec.c:2148
#24 exec_stmts (estate=0x60000fffffcc05a8, stmts=<value optimized out>)
at pl_exec.c:1239
#25 0x20000000061a1a30 in exec_stmt_if (estate=0x60000fffffcc05a8,
stmt=<value optimized out>) at pl_exec.c:1479
#26 0x200000000619dd00 in exec_stmt (estate=0x60000fffffcc05a8,
stmts=<value optimized out>) at pl_exec.c:1288
#27 exec_stmts (estate=0x60000fffffcc05a8, stmts=<value optimized out>)
at pl_exec.c:1239
#28 0x200000000619c810 in exec_stmt_block (estate=0x60000fffffcbf9b0, block=0x0)
at pl_exec.c:1177
#29 0x20000000061a3660 in plpgsql_exec_function (func=0x60000000001547c8,
fcinfo=0x60000fffffcc09c0) at pl_exec.c:317
#30 0x2000000006186ae0 in plpgsql_call_handler (fcinfo=0x10000)
at pl_handler.c:122
#31 0x40000000003ae660 in ExecMakeFunctionResult (fcache=0x60000000001cd230,
econtext=0x60000000001cd078,

I haven't looked to see if any of these have an excessive number of
local variables.

regards, tom lane
