| PostgreSQL 9.2devel Documentation | ||||
|---|---|---|---|---|
| Prev | Up | Chapter 42. PL/Python - Python Procedural Language | Next | |
The PL/Python language module automatically imports a Python module called plpy. The functions and constants in this module are available to you in the Python code as plpy.foo.
The plpy module provides two
functions called execute and
prepare. Calling plpy.execute with a query string and an
optional limit argument causes that query to be run and the
result to be returned in a result object. The result object
emulates a list or dictionary object. The result object can be
accessed by row number and column name. It has these additional
methods: nrows which returns the
number of rows returned by the query, status which is the SPI_execute() return value, colnames which is the list of column names,
coltypes which is the list of
column type OIDs, and coltypmods
which is the list of type-specific type modifiers for the
columns. The result object can be modified.
Note that calling plpy.execute will cause the entire result set to be read into memory. Only use that function when you are sure that the result set will be relatively small. If you don't want to risk excessive memory usage when fetching large results, use plpy.cursor rather than plpy.execute.
For example:
rv = plpy.execute("SELECT * FROM my_table", 5)
returns up to 5 rows from my_table. If my_table has a column my_column, it would be accessed as:
foo = rv[i]["my_column"]
The second function, plpy.prepare, prepares the execution plan for
a query. It is called with a query string and a list of
parameter types, if you have parameter references in the query.
For example:
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
text is the type of the variable
you will be passing for $1. After
preparing a statement, you use the function plpy.execute to run it:
rv = plpy.execute(plan, [ "name" ], 5)
The third argument is the limit and is optional.
Query parameters and result row fields are converted between PostgreSQL and Python data types as described in Section 42.3. The exception is that composite types are currently not supported: They will be rejected as query parameters and are converted to strings when appearing in a query result. As a workaround for the latter problem, the query can sometimes be rewritten so that the composite type result appears as a result row rather than as a field of the result row. Alternatively, the resulting string could be parsed apart by hand, but this approach is not recommended because it is not future-proof.
When you prepare a plan using the PL/Python module it is automatically saved. Read the SPI documentation (Chapter 43) for a description of what this means. In order to make effective use of this across function calls one needs to use one of the persistent storage dictionaries SD or GD (see Section 42.4). For example:
CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
plan = SD.setdefault("plan", plpy.prepare("SELECT 1"))
# rest of function
$$ LANGUAGE plpythonu;
The plpy.cursor function accepts the same arguments as plpy.execute (except for limit) and returns a cursor object, which allows you to process large result sets in smaller chunks. As with plpy.execute, either a query string or a plan object along with a list of arguments can be used. The cursor object provides a fetch method that accepts an integer parameter and returns a result object. Each time you call fetch, the returned object will contain the next batch of rows, never larger than the parameter value. Once all rows are exhausted, fetch starts returning an empty result object. Cursor objects also provide an iterator interface, yielding one row at a time until all rows are exhausted. Data fetched that way is not returned as result objects, but rather as dictionaries, each dictionary corresponding to a single result row.
Cursors are automatically disposed of. But if you want to explicitly release all resources held by a cursor, use the close method. Once closed, a cursor cannot be fetched from anymore.
Tip: Do not confuse objects created by plpy.cursor with DB-API cursors as defined by the Python Database API specification. They don't have anything in common except for the name.
An example of two ways of processing data from a large table is:
CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$
odd = 0
for row in plpy.cursor("select num from largetable"):
if row['num'] % 2:
odd += 1
return odd
$$ LANGUAGE plpythonu;
CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$
odd = 0
cursor = plpy.cursor("select num from largetable")
while True:
rows = cursor.fetch(batch_size)
if not rows:
break
for row in rows:
if row['num'] % 2:
odd += 1
return odd
$$ LANGUAGE plpythonu;
CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$
odd = 0
plan = plpy.prepare("select num from largetable where num % $1 <> 0", ["integer"])
rows = list(plpy.cursor(plan, [2]))
return len(rows)
$$ LANGUAGE plpythonu;
Functions accessing the database might encounter errors,
which will cause them to abort and raise an exception. Both
plpy.execute and plpy.prepare can raise an instance of a
subclass of plpy.SPIError, which by
default will terminate the function. This error can be handled
just like any other Python exception, by using the try/except construct. For example:
CREATE FUNCTION try_adding_joe() RETURNS text AS $$
try:
plpy.execute("INSERT INTO users(username) VALUES ('joe')")
except plpy.SPIError:
return "something went wrong"
else:
return "Joe added"
$$ LANGUAGE plpythonu;
The actual class of the exception being raised corresponds to the specific condition that caused the error. Refer to Table A-1 for a list of possible conditions. The module plpy.spiexceptions defines an exception class for each PostgreSQL condition, deriving their names from the condition name. For instance, division_by_zero becomes DivisionByZero, unique_violation becomes UniqueViolation, fdw_error becomes FdwError, and so on. Each of these exception classes inherits from SPIError. This separation makes it easier to handle specific errors, for instance:
CREATE FUNCTION insert_fraction(numerator int, denominator int) RETURNS text AS $$
from plpy import spiexceptions
try:
plan = plpy.prepare("INSERT INTO fractions (frac) VALUES ($1 / $2)", ["int", "int"])
plpy.execute(plan, [numerator, denominator])
except spiexceptions.DivisionByZero:
return "denominator cannot equal zero"
except spiexceptions.UniqueViolation:
return "already have that fraction"
except plpy.SPIError, e:
return "other error, SQLSTATE %s" % e.sqlstate
else:
return "fraction inserted"
$$ LANGUAGE plpythonu;
Note that because all exceptions from the plpy.spiexceptions module inherit from SPIError, an except clause handling it will catch any database access error.
As an alternative way of handling different error conditions, you can catch the SPIError exception and determine the specific error condition inside the except block by looking at the sqlstate attribute of the exception object. This attribute is a string value containing the "SQLSTATE" error code. This approach provides approximately the same functionality