Re: Using real libpq parameters

From: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
To: "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
Cc: psycopg(at)postgresql(dot)org
Subject: Re: Using real libpq parameters
Date: 2011-02-28 21:27:46
Message-ID: AANLkTikamsFRkyTexx+Hkin7FMWMcXgd00W9QZC528U1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

On Mon, Feb 28, 2011 at 6:49 PM, A.M. <agentm(at)themactionfaction(dot)com> wrote:
>
> On Feb 27, 2011, at 6:35 PM, Daniele Varrazzo wrote:

>> I see, but PQexecf is another variadic function, so my point is the
>> same. If you have arguments in variables a, b, c, you can call
>> PQputf("%text, %text, %text", a, b, c) and that's handy. But if you
>> have an array containing three variables v[] and its length L, how do
>> you pass them to PQputf?
>
> One calls it multiple times. Hopefully this example will clear up the confusion:
> [...]
>  PQputf(param,"%int4",5);    //calling PQputf multiple times
>  PQputf(param,"%text","spam");
>
>  PGresult *res=PQparamExec(conn,param,"SELECT $1,$2;",0);
> [...]

Yes it does, thank you.

>> I know about the suboptimal plan Postgres generates in case of
>> prepared statements. Does it apply for PQexecParams too or just for
>> PQprepare? I've asked on the -general about this.
>
> It applies to all prepared statements regardless of the API. However, as a special case, the special "" (empty string) prepared statement follows this rule:
> [...]
> http://www.postgresql.org/docs/9.0/interactive/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

This is interesting. This means that if we wanted to use PQprepare to
avoid repeated planning in executemany() we should set a non-empty
name for the statement. Good to know...

> libpqtypes supports ALL possible PostgreSQL types (user-defined as well). Right now, creating pycopg2 types involves parsing the textual representations from the result, however, using libpqtypes, psycopg2 could offer a much more natural API like this:
>
> class Point(object):,
>        def __init__(self,x,y):
>                self.x = x
>                self.y = y
> [...]
> psycopg2.register_type(Point) #psycopg2 extracts all properties as a tuple to pass to libpqtypes

FYI, in psycopg 2.4 we have added register_composite() that does about
the same, converting user-defined types into [named]tuples. Of course
it parses the textual representation. The Point was just an example
for a simple adapter, it wasn't meant as an example for composite
(e.g. if you wanted to write ad adapter for *yuck* money, that's the
recipe)

>>> So, to summarize, libpqtypes:

> The best thing about libpqtypes is that you don't need to learn about the binary protocol because it is all cleanly wrapped up. For example, the library has to account for endianness of the server and other hurdles. It is all handled transparently.

Yup, it seems the good way to go indeed if we wanted to introduce
binary communication. Note anyway that using PQexecParams doesn't
imply automatically using binary parameters, and not for all the data
types the benefit of binary is proven. It is definitely for bytea. I'd
argue that for text is the same. I've chatted with Christeas at FOSDEM
and he said (correct me If I'm wrong) after some benchmark he has made
that the performance with numbers were mixed: sending 8 bytes to
represent '3' is slower than sending and parsing the textual
representation. Datetime types would definitely benefit from binary,
but probably they are the only other types for which there is a sure
gain.

I see two sequential steps in this development: the first is to lay
out the ParamCursor (that should convert the python query into the
correct representation for PQexecParams)* and define the ISQLParam
protocol. Within this framework we could introduce the binary
protocol. So we could proceed:

1. text for everything except bytea (first step)
2a. binary only for selected data types, with our code, or
2b. binary for everything using libpqtypes

in case we went for 2b help from people "into" libpqtypes would be
greatly appreciated of course :)

* what query conversion? To make ParamCursor use easier (migrating
previous code, using python dicts) I envise ParamCursor to use the
current format ("%s") and pyformat ("%(name)s") not the libpq $1, $2
format. This will require some funny conversion of the query string:

- "SELECT %s, %s, %s" would become "SELECT $1, $2, $3"
- "SELECT %(foo)s, %(bar)s, %(foo)s" would become "SELECT $1, $2, $1"
plus a remapping of the parameters used in dict into a sequence of 2
items.

Cheers,

-- Daniele

In response to

Browse psycopg by date

  From Date Subject
Next Message Daniele Varrazzo 2011-02-28 23:07:39 Re: gmpy adapter
Previous Message A.M. 2011-02-28 18:49:38 Re: Using real libpq parameters