Re: plpython SPI cursors

From: Steve Singer <ssinger_pg(at)sympatico(dot)ca>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpython SPI cursors
Date: 2011-11-26 16:21:14
Message-ID: BLU0-SMTP2363BD56F6BB1AD3E704448ECC0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11-11-23 01:58 PM, Jan Urbański wrote:
> On 20/11/11 19:14, Steve Singer wrote:
>> On 11-10-15 07:28 PM, Jan Urbański wrote:
>>> Hi,
>>>
>>> attached is a patch implementing the usage of SPI cursors in PL/Python.
>>> Currently when trying to process a large table in PL/Python you have
>>> slurp it all into memory (that's what plpy.execute does).
>>>
>>> J
>>
>> I found a few bugs (see my testing section below) that will need fixing
>> + a few questions about the code
>
> Responding now to all questions and attaching a revised patch based on
> your comments.
>

I've looked over the revised version of the patch and it now seems fine.

Ready for committer.

>> Do we like the name plpy.cursor or would we rather call it something
>> like
>> plpy.execute_cursor(...) or plpy.cursor_open(...) or
>> plpy.create_cursor(...)
>> Since we will be mostly stuck with the API once we release 9.2 this is
>> worth
>> some opinions on. I like cursor() but if anyone disagrees now is the
>> time.
>
> We use plpy.subtransaction() to create Subxact objects, so I though
> plpy.cursor() would be most appropriate.
>
>> This patch does not provide a wrapper around SPI_cursor_move. The patch
>> is useful without that and I don't see anything that preculdes
>> someone else
>> adding that later if they see a need.
>
> My idea is to add keyword arguments to plpy.cursor() that will allow
> you to decide whether you want a scrollable cursor and after that
> provide a move() method.
>
>> The patch includes documentation updates that describes the new feature.
>> The Database Access page doesn't provide a API style list of database
>> access
>> functions like the plperl
>> http://www.postgresql.org/docs/9.1/interactive/plperl-builtins.html page
>> does. I think the organization of the perl page is
>> clearer than the python one and we should think about a doing some
>> documentaiton refactoring. That should be done as a seperate patch and
>> shouldn't be a barrier to committing this one.
>
> Yeah, the PL/Python docs are a bit chaotic right now. I haven't yet
> summoned force to overhaul them.
>
>> in PLy_cursor_plan line 4080
>> + PG_TRY();
>> + {
>> + Portal portal;
>> + char *volatile nulls;
>> + volatile int j;
>
>> I am probably not seeing a code path or misunderstanding something
>> about the setjmp/longjump usages but I don't see why nulls and j need
>> to be
>> volatile here.
>
> It looked like you could drop volatile there (and in
> PLy_spi_execute_plan, where this is copied from (did I mention there's
> quite some code duplication in PL/Python?)) but digging in git I found
> this commit:
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2789b7278c11785750dd9d2837856510ffc67000
>
>
> that added the original volatile qualification, so I guess there's a
> reason.
>
>> line 444
>> PLy_cursor(PyObject *self, PyObject *args)
>> + {
>> + char *query;
>> + PyObject *plan;
>> + PyObject *planargs = NULL;
>> +
>> + if (PyArg_ParseTuple(args, "s", &query))
>> + return PLy_cursor_query(query);
>> +
>>
>> Should query be freed with PyMem_free()
>
> No, PyArg_ParseTuple returns a string on the stack, I check that
> repeatedly creating a cursor with a plan argument does not leak memory
> and that adding PyMem_Free there promptly leads to a segfault.
>
>
>> I tested both python 2.6 and 3 on a Linux system
>>
>> [test cases demonstrating bugs]
>
> Turns out it's a really bad idea to store pointers to Portal
> structures, because they get invalidated by the subtransaction abort
> hooks.
>
> I switched to storing the cursor name and looking it up in the
> appropriate hash table every time it's used. The examples you sent
> (which I included as regression tests) now cause a ValueError to be
> raised with a message stating that the cursor has been created in an
> aborted subtransaction.
>
> Not sure about the wording of the error message, though.
>
> Thanks again for the review!
>
> Cheers,
> Jan
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-11-26 17:13:47 Re: Avoiding repeated snapshot computation
Previous Message Andrew Dunstan 2011-11-26 16:02:10 Re: psql setenv command