Endless loop calling PL/Python set returning functions

From: Alexey Grishchenko <agrishchenko(at)pivotal(dot)io>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Endless loop calling PL/Python set returning functions
Date: 2016-03-10 14:07:28
Message-ID: CAH38_tnKLFvCCbiaaQmVhsvboher4Vqv680FqR8QEkWfSO2CXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

There is a bug in implementation of set-returning functions in PL/Python.
When you call the same set-returning function twice in a single query, the
executor falls to infinite loop which causes OOM. Here is a simple
reproduction for this issue:

CREATE OR REPLACE FUNCTION func(iter int) RETURNS SETOF int AS $$
return xrange(iter)
$$ LANGUAGE plpythonu;

select func(3), func(4);

The endless loop is caused by the fact that PL/Python uses PLyProcedure
structure for each of the functions, containing information specific for
the function. This structure is used to store the result set iterator
returned by the Python function call. But in fact, when we call the same
function twice, PL/Python uses the same structure for both calls, and the
same result set iterator (PLyProcedure.setof), which is being constantly
updated by one function after another. When the iterator reaches the end,
the first function sets it to null. Then Postgres calls the second
function, it receives NULL iterator and calls Python function once again,
receiving another iterator. This is an endless loop

In fact, for set-returning functions in Postgres we should use a set
of SRF_* functions, which gives us an access to function call context
(FuncCallContext). In my implementation this context is used to store the
iterator for function result set, so these two calls would have separate
iterators and the query would succeed.

Another issue with calling the same set-returning function twice in the
same query, is that it would delete the input parameter of the function
from the global variables dictionary at the end of execution. With calling
the function twice, this code attempts to delete the same entry from global
variables dict twice, thus causing KeyError. This is why the
function PLy_function_delete_args is modified as well to check whether the
key we intend to delete is in the globals dictionary.

New regression test is included in the patch.

--
Best regards,
Alexey Grishchenko

Attachment Content-Type Size
0001-Fix-endless-loop-in-plpython-set-returning-function.patch application/octet-stream 7.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-03-10 14:08:39 Re: Parallel Aggregate
Previous Message Robert Haas 2016-03-10 13:55:52 Re: Parallel Aggregate