Re: Endless loop calling PL/Python set returning functions

From: Alexey Grishchenko <agrishchenko(at)pivotal(dot)io>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Endless loop calling PL/Python set returning functions
Date: 2016-03-11 10:09:03
Message-ID: CAH38_tkJx_-Pf2PBHDaCJ=VoLsg=vDniJpNS9WVyOt+PU6j8Fw@mail.gmail.com
Lists: pgsql-hackers

Alexey Grishchenko <agrishchenko(at)pivotal(dot)io> wrote:

> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Alexey Grishchenko <agrishchenko(at)pivotal(dot)io> writes:
>> > No, my fix handles this well.
>> > In fact, with the first function call you allocate global variables
>> > representing Python function input parameters, call the function and
>> > receive iterator over the function results. Then in a series of Postgres
>> > calls to PL/Python handler you just fetch next value from the iterator, you
>> > are not calling the Python function anymore. When the iterator reaches the
>> > end, PL/Python call handler deallocates the global variable representing
>> > function input parameter.
>>
>> > Regardless of the number of parallel invocations of the same function, each
>> > of them in my patch would set its own input parameters to the Python
>> > function, call the function and receive separate iterators. When the first
>> > function's result iterator would reach its end, it would deallocate the
>> > input global variable. But it won't affect other functions as they no
>> > longer need to invoke any Python code.
>>
>> Well, if you think that works, why not undo the global-dictionary changes
>> at the end of the first call, rather than later? Then there's certainly
>> no overlap in their lifespan.
>>
>> regards, tom lane
>>
>
> Could you elaborate more on this? In general, stack-like solution would
> work - if before the function call there is a global variable with the name
> matching input variable name, push its value to the stack, and pop it after
> the function execution. Would implement it tomorrow and see how it works
>
>
> --
>
> Sent from handheld device
>

I have improved the code using the proposed approach. The second version of
the patch is attached.

It works as follows: the procedure object PLyProcedure stores the call
stack depth (calldepth field) and the stack itself (argstack field). When
the call stack depth is zero we do no additional processing, i.e. there is
no performance impact on existing end-user functions. Stack manipulation
kicks in only when calldepth is greater than zero, which happens either
when the function is called recursively through SPI, or when the same
set-returning function is called two or more times in a single query.
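
The bookkeeping described above can be sketched in plain Python. This is
only an illustration, not code from the patch: the Procedure class, its
globals dictionary and the call() method are hypothetical stand-ins, and
only the calldepth and argstack names come from the description above.

```python
class Procedure:
    """Toy stand-in for PLyProcedure (hypothetical, not the patch code)."""

    def __init__(self, argnames, body):
        self.argnames = argnames   # names of the function's input arguments
        self.body = body           # callable that reads its arguments from globals
        self.globals = {}          # stands in for the procedure's Python globals
        self.calldepth = 0
        self.argstack = []

    def call(self, *args):
        if self.calldepth > 0:
            # A call is already in progress: save the arguments it can see.
            saved = {n: self.globals[n] for n in self.argnames if n in self.globals}
            self.argstack.append(saved)
        self.calldepth += 1
        self.globals.update(zip(self.argnames, args))
        try:
            return self.body(self.globals)
        finally:
            self.calldepth -= 1
            for n in self.argnames:
                self.globals.pop(n, None)
            if self.calldepth > 0:
                # Give the outer call its own arguments back.
                self.globals.update(self.argstack.pop())


def body(g):
    # Same shape as a recursive PL/Python function that calls itself via SPI.
    r = 0
    if g["a"] > 1:
        r = proc.call(g["a"] - 1)
    return g["a"] + r


proc = Procedure(["a"], body)
result = proc.call(10)
```

When calldepth is zero on entry, the sketch never touches argstack, which
mirrors the claim that ordinary non-nested calls pay no extra cost.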

Example of multiple calls to an SRF within a single query:

CREATE OR REPLACE FUNCTION func(iter int) RETURNS SETOF int AS $$
return xrange(iter)
$$ LANGUAGE plpythonu;

select func(3), func(4);

Before the patch, this query caused an endless loop that ended in an OOM.
Now it works as it should.

Example of recursion with SPI:

CREATE OR REPLACE FUNCTION test(a int) RETURNS int AS $BODY$
r = 0
if a > 1:
    r = plpy.execute("SELECT test(%d) as a" % (a-1))[0]['a']
return a + r
$BODY$ LANGUAGE plpythonu;

select test(10);

Before the patch, this query failed with "NameError: global name 'a' is not
defined". Now it works correctly and returns 55.
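
To make the failure mode concrete, here is a small plain-Python model of the
recursive case. It assumes, as described earlier in this thread, that the
handler publishes the argument as a global and removes it when the call
returns. The names shared_globals, call_unpatched and call_patched are
made up for this sketch, and KeyError stands in for the NameError that
PL/Python reports.

```python
shared_globals = {}  # stands in for the function's Python globals


def body(call):
    # Mirrors test(a): recurse, then read 'a' again after the inner call.
    r = 0
    if shared_globals["a"] > 1:
        r = call(shared_globals["a"] - 1)
    return shared_globals["a"] + r


def call_unpatched(a):
    # Assumed pre-patch behaviour: set the global, drop it on return.
    shared_globals["a"] = a
    try:
        return body(call_unpatched)
    finally:
        shared_globals.pop("a", None)


def call_patched(a):
    # Save whatever value was visible before the call and restore it after.
    missing = object()
    saved = shared_globals.get("a", missing)
    shared_globals["a"] = a
    try:
        return body(call_patched)
    finally:
        if saved is missing:
            shared_globals.pop("a", None)
        else:
            shared_globals["a"] = saved


try:
    call_unpatched(10)
    unpatched_outcome = "ok"
except KeyError:
    # The inner call removed 'a', so the outer call cannot read it.
    unpatched_outcome = "KeyError"

patched_result = call_patched(10)
```

In the unpatched variant the innermost call removes 'a' before its caller
reads it again for the final addition; saving and restoring the previous
value is what lets the recursion finish.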

--
Best regards,
Alexey Grishchenko

Attachment Content-Type Size
0002-Fix-endless-loop-in-plpython-set-returning-function.patch application/octet-stream 12.9 KB
