Skip site navigation (1) Skip section navigation (2)

Re: plpython SPI cursors

From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Steve Singer <ssinger_pg(at)sympatico(dot)ca>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpython SPI cursors
Date: 2011-11-23 18:58:55
Message-ID: 4ECD426F.5060702@wulczer.org (view raw or flat)
Thread:
Lists: pgsql-hackers
On 20/11/11 19:14, Steve Singer wrote:
> On 11-10-15 07:28 PM, Jan Urbański wrote:
>> Hi,
>>
>> attached is a patch implementing the usage of SPI cursors in PL/Python.
>> Currently when trying to process a large table in PL/Python you have
>> slurp it all into memory (that's what plpy.execute does).
>>
>> J
>
> I found a few bugs (see my testing section below) that will need fixing
> + a few questions about the code

Responding now to all questions and attaching a revised patch based on 
your comments.

> Do we like the name plpy.cursor or would we rather call it something like
> plpy.execute_cursor(...) or plpy.cursor_open(...) or
> plpy.create_cursor(...)
> Since we will be mostly stuck with the API once we release 9.2 this is
> worth
> some opinions on. I like cursor() but if anyone disagrees now is the time.

We use plpy.subtransaction() to create Subxact objects, so I though 
plpy.cursor() would be most appropriate.

> This patch does not provide a wrapper around SPI_cursor_move. The patch
> is useful without that and I don't see anything that preculdes someone else
> adding that later if they see a need.

My idea is to add keyword arguments to plpy.cursor() that will allow you 
to decide whether you want a scrollable cursor and after that provide a 
move() method.

> The patch includes documentation updates that describes the new feature.
> The Database Access page doesn't provide a API style list of database
> access
> functions like the plperl
> http://www.postgresql.org/docs/9.1/interactive/plperl-builtins.html page
> does. I think the organization of the perl page is
> clearer than the python one and we should think about a doing some
> documentaiton refactoring. That should be done as a seperate patch and
> shouldn't be a barrier to committing this one.

Yeah, the PL/Python docs are a bit chaotic right now. I haven't yet 
summoned force to overhaul them.

> in PLy_cursor_plan line 4080
> + PG_TRY();
> + {
> + Portal portal;
> + char *volatile nulls;
> + volatile int j;

> I am probably not seeing a code path or misunderstanding something
> about the setjmp/longjump usages but I don't see why nulls and j need to be
> volatile here.

It looked like you could drop volatile there (and in 
PLy_spi_execute_plan, where this is copied from (did I mention there's 
quite some code duplication in PL/Python?)) but digging in git I found 
this commit:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2789b7278c11785750dd9d2837856510ffc67000

that added the original volatile qualification, so I guess there's a reason.

> line 444
> PLy_cursor(PyObject *self, PyObject *args)
> + {
> + char *query;
> + PyObject *plan;
> + PyObject *planargs = NULL;
> +
> + if (PyArg_ParseTuple(args, "s", &query))
> + return PLy_cursor_query(query);
> +
>
> Should query be freed with PyMem_free()

No, PyArg_ParseTuple returns a string on the stack, I check that 
repeatedly creating a cursor with a plan argument does not leak memory 
and that adding PyMem_Free there promptly leads to a segfault.


> I tested both python 2.6 and 3 on a Linux system
>
> [test cases demonstrating bugs]

Turns out it's a really bad idea to store pointers to Portal structures, 
because they get invalidated by the subtransaction abort hooks.

I switched to storing the cursor name and looking it up in the 
appropriate hash table every time it's used. The examples you sent 
(which I included as regression tests) now cause a ValueError to be 
raised with a message stating that the cursor has been created in an 
aborted subtransaction.

Not sure about the wording of the error message, though.

Thanks again for the review!

Cheers,
Jan

Attachment: cursor-support-for-plpython-v2.patch
Description: text/x-diff (40.8 KB)

In response to

Responses

pgsql-hackers by date

Next:From: Robert HaasDate: 2011-11-23 19:24:22
Subject: Re: Inlining comparators as a performance optimisation
Previous:From: Tom LaneDate: 2011-11-23 18:30:18
Subject: Re: Not HOT enough

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group