Using real libpq parameters

From: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
To: psycopg(at)postgresql(dot)org
Subject: Using real libpq parameters
Date: 2011-02-26 23:40:59
Message-ID: AANLkTi=ym3SCQKCQBtp8RJHUswwAPOpjXYKTXS=aHWzp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

Hello,

I've tried to make order in the ideas about pros and cons of using
real query parameters in libpq. Here is my analysis and a possible
plan.

Currently Psycopg uses only the PQexec [1] libpq function to send
queries to the backend: when a query with parameters is executed,
psycopg adapts every parameter value to a sql literal (the protocol
name is ISQLQuote) [2], merges the parameters to the query and sends
everything to the backend using PQexec. This strategy has largely
proven satisfying, providing the greatest flexibility for the adapter
to write the query.

Using PQexecParams [3] instead of PQexec would have several problems
and the behavior of Psycopg would change in a non-backward compatible
way. PQexecParams only accepts one non-empty command while
cur.execute() allows any number of queries separated by semicolons.
Also passing complex data types seems more... complex. An example is
with the array: because psycopg sends a complete query to the backend,
it can use the array[...] syntax where every element is escaped in a
standard way [4] but the array[] constructor doesn't seem accepted as
a parameter. In other places psycopg uses constructors that wouldn't
be valid as parameter values (e.g. '2010-01-01'::date for date
representation): in some cases the python type could be converted into
the proper paramTypes entry for disambiguation (in execute, but
probably not in executemany). The "default" literal is another handy
feature not accepted as parameter. The bottom line is that adapting
Python values for PQexec and for PQexecParams are two entirely
different businesses and adapters currently implemented for ISQLQuote
wouldn't be compatible.

So we have incompatibilities in the queries that can be executed and
in the adapters that users may have extended. I think this rules out
entirely the idea of using PQexecParams as the default exec function,
at least until psycopg3000. What would be the advantages? One is that
large strings or binary values would be treated more efficiently in
the backend, as they wouldn't bloat the parser. Another is that we may
start supporting prepared statements: I would avoid an explicit
.prepare() method exposed to Python (as the PREPARE/EXECUTE sql
statements can be already used for this task), but for instance
.executemany() could use internally PQprepare/PQexecPrepared, whose
interface is similar to PQexecParams.

Luckily (well, it wasn't luck at all: thanks to Federico's design)
psycopg already has two extension mechanisms to cope with having both
the execution methods: one is the possibility to subclass
connections/cursors and thus have a cursor supporting PQexecParams in
execute() and PQprepare in executemany() (let's call it ParamCursor).
Another one is the completely generic adaptation mechanism of Python
objects [5]: there could be a separate protocol (let's say ISQLParam)
to which python object passed to the query should be conformed:
conforming object may expose different methods from the ones required
by ISQLQuote, that may be needed to interface with PQexecParam.

Keeping the two protocols well separated would also allow us to use
the PQexec-based side in ways we preferred to avoid before. For
instance there's often been the request of a way to pass identifiers
to a query e.g. to represents table or field names: I remember the
request being discarded precisely on the basis that switching to
"proper parameters" the method would have broke. Keeping two protocols
we may have, for instance, an "Identifier" wrapper allowing
execute("insert into %s values (%s)", [Identifier('foo'), 'bar']): the
method would only work on the classic cursor, not the param-based one,
and this would be implemented by making the Identifier object conform
to ISQLQuote but not to ISQLParam.

PQexecParams also allows binary representation for the parameters. I
know they would be more efficient than textual types, but it is an
implementation leak probably useful for a specific program talking to
a specific server, but not for a generic library. So I would rule out
its usage for every data type except for bytea.

This of course will not be implemented tonight and won't be released
in Psycopg 2.4 :)

Comments and discussion are appreciated. Cheers.

-- Daniele

[1] http://www.postgresql.org/docs/9.0/static/libpq-exec.html#LIBPQ-PQEXEC
[2] http://initd.org/psycopg/docs/extensions.html#sql-adaptation-protocol-objects
[3] http://www.postgresql.org/docs/9.0/static/libpq-exec.html#LIBPQ-PQEXECPARAMS
[4] http://www.postgresql.org/docs/9.0/static/arrays.html#ARRAYS-IO
[5] http://www.python.org/dev/peps/pep-0246/

Responses

Browse psycopg by date

  From Date Subject
Next Message A.M. 2011-02-26 23:47:22 Re: Using real libpq parameters
Previous Message Daniele Varrazzo 2011-02-26 18:03:00 Re: Binary Transfer / Types