From: | Alexey Grishchenko <agrishchenko(at)pivotal(dot)io> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Endless loop calling PL/Python set returning functions |
Date: | 2016-03-10 16:20:10 |
Message-ID: | CAH38_tkimV2nJu13M8wZGFFDv-4riLB_LB0Zd2hKVCLRTHcXDw@mail.gmail.com |
Lists: | pgsql-hackers |
I agree that passing function parameters through globals is not the best
solution.
It works in the following way: custom code (in our case, the Python
function invocation) is executed with PyEval_EvalCode
<https://docs.python.org/2/c-api/veryhigh.html>. As an input to this C
function you specify the dictionary of globals that will be available to
the code. The structure PLyProcedure stores "PyObject *globals;", which is
the dictionary of globals for a specific function. So SPI works fine, as
each function has a separate dictionary of globals and they don't conflict
with each other.
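
A rough Python-level sketch of this arrangement, assuming that exec() on a
compiled code object is an acceptable stand-in for PyEval_EvalCode. The
Procedure class, its call() method and the "result" variable are
hypothetical names used only for illustration; they are not the PL/Python
source.

```python
# Hypothetical stand-in for PLyProcedure: one compiled body plus its own
# globals dictionary (the role played by "PyObject *globals").
class Procedure:
    def __init__(self, name, source):
        self.name = name
        self.code = compile(source, name, "exec")
        self.globals = {}

    def call(self, **args):
        # Arguments are injected into the procedure's globals ...
        self.globals.update(args)
        # ... the body runs against that dictionary ...
        exec(self.code, self.globals)
        result = self.globals["result"]
        # ... and the arguments are removed again afterwards.
        for key in args:
            del self.globals[key]
        return result


double = Procedure("double", "result = a * 2")
negate = Procedure("negate", "result = -a")

r1 = double.call(a=21)
r2 = negate.call(a=5)
```

Because each Procedure owns a separate dictionary, the argument "a" of one
function never overwrites the "a" of a different function.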
One scenario in which the problem occurs is when you call the same
set-returning function twice in a single query. The two calls share the
same "globals", which is not a bad thing in itself, but when one call
finishes execution and deletes its input parameters from the globals, the
second fails trying to do the same. I included the fix for this problem in
my patch.
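
The double deletion can be imitated in plain Python. This is only a sketch
of the idea: the helper names below are hypothetical, and the guarded
variant mirrors the kind of membership check described for
PLy_function_delete_args rather than reproducing the actual C code.

```python
# One globals dictionary shared by two calls of the same function.
shared_globals = {}


def set_args(globals_dict, args):
    globals_dict.update(args)


def delete_args_unchecked(globals_dict, names):
    # Deletes unconditionally, so a second deletion raises KeyError.
    for name in names:
        del globals_dict[name]


def delete_args_checked(globals_dict, names):
    # Deletes only keys that are still present.
    for name in names:
        if name in globals_dict:
            del globals_dict[name]


# Two calls in the same query both set the argument, then both clean up.
set_args(shared_globals, {"n": 3})
set_args(shared_globals, {"n": 3})
delete_args_unchecked(shared_globals, ["n"])
try:
    delete_args_unchecked(shared_globals, ["n"])
    second_delete_failed = False
except KeyError:
    second_delete_failed = True

# With the membership check the second cleanup is a no-op.
set_args(shared_globals, {"n": 3})
set_args(shared_globals, {"n": 3})
delete_args_checked(shared_globals, ["n"])
delete_args_checked(shared_globals, ["n"])
```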
The second scenario in which the problem occurs is when you call the same
PL/Python function recursively. For example, this code will not work:
create or replace function test(a int) returns int as $BODY$
r = 0
if a > 1:
    r = plpy.execute("SELECT test(%d) as a" % (a-1))[0]['a']
return a + r
$BODY$ language plpythonu;
select test(10);
The function "test" has a single PLyProcedure object allocated to handle
it, thus it has a single "globals" dictionary. When the inner function call
finishes, it removes the key "a" from the dictionary, and the outer
function fails with "NameError: global name 'a' is not defined" when it
tries to execute "return a + r".
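
The same failure can be reproduced outside PostgreSQL with a small
simulation. It assumes one shared globals dictionary per function, replaces
plpy.execute with a direct callback named "call", and uses a "result"
variable instead of "return"; all of these are hypothetical simplifications
and not how PL/Python is actually implemented.

```python
# Body of the simulated function; "call" stands in for the SPI round trip.
SOURCE = """
r = 0
if a > 1:
    r = call(a - 1)
result = a + r
"""


class Procedure:
    def __init__(self, source):
        self.code = compile(source, "test", "exec")
        # A single globals dictionary is shared by every invocation.
        self.globals = {"call": self.call}

    def call(self, a):
        self.globals["a"] = a
        exec(self.code, self.globals)
        result = self.globals["result"]
        # Cleanup removes "a" even if an outer invocation still needs it.
        del self.globals["a"]
        return result


test = Procedure(SOURCE)

# Without recursion the function works.
base_case = test.call(1)

# With one level of recursion the inner call deletes "a" and the outer
# invocation fails when it evaluates "a + r".
try:
    test.call(2)
    error = None
except NameError as exc:
    error = str(exc)
```

Under Python 3 the message reads "name 'a' is not defined", without the
word "global" that Python 2 includes.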
But the second issue is a separate story, and I think it is worth a
separate patch.
On Thu, Mar 10, 2016 at 3:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alexey Grishchenko <agrishchenko(at)pivotal(dot)io> writes:
> > There is a bug in implementation of set-returning functions in PL/Python.
> > When you call the same set-returning function twice in a single query, the
> > executor falls to infinite loop which causes OOM.
>
> Ugh.
>
> > Another issue with calling the same set-returning function twice in the
> > same query, is that it would delete the input parameter of the function
> > from the global variables dictionary at the end of execution. With calling
> > the function twice, this code attempts to delete the same entry from global
> > variables dict twice, thus causing KeyError. This is why the
> > function PLy_function_delete_args is modified as well to check whether the
> > key we intend to delete is in the globals dictionary.
>
> That whole business with putting a function's parameters into a global
> dictionary makes me itch. Doesn't it mean problems if one plpython
> function calls another (presumably via SPI)?
>
> regards, tom lane
>
--
Best regards,
Alexey Grishchenko