[RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)

From: Manlio Perillo <manlio(dot)perillo(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [RFC] ideas for a new Python DBAPI driver (was Re: libpq test suite)
Date: 2013-02-14 14:23:15
Message-ID: 511CF353.1040901@gmail.com
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 14/02/2013 14:06, Albe Laurenz wrote:
> Manlio Perillo wrote:
>> Sorry for the question, but where can I find the libpq test suite?
>> I can not find it in the PostgreSQL sources; it seems that there are
>> only some examples, in src/test/examples.
>
> The regression tests are in src/interfaces/libpq/test
> and currently contain only URL parsing tests.
>

Ok, thanks.

I'm not sure whether I should add a new test there, so I'll use the test
suite of my project instead; it contains an (almost) 1:1 wrapper around
libpq.

>> I'm planning to add some new features to libpq:
>>
>> * make PQsendPrepare send a "Describe Portal" protocol message
>> * add support for setting per column binary result format
>
> I suggested exactly that here:
> http://www.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C208A4EDD4@exadv11.host.magwien.gv.at
> and met resistance:
> - one can use libpqtypes
> - I couldn't find a convincing use case
> - it clutters up the API
>

For my Python DBAPI2 PostgreSQL driver I plan the following optimizations:

1) always use PQsendQueryParams functions.

This will avoid having to escape parameters, as psycopg2 does
(IMHO it still uses the simple query protocol for compatibility
purposes).
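
As a rough illustration of point 1: PQsendQueryParams takes the parameter
values separately from the SQL text and refers to them as $1, $2, ..., so
the driver only has to rewrite the DBAPI placeholders instead of escaping
values. The helper below is a hypothetical sketch (its name and behaviour
are assumptions, not part of any existing driver), and it is deliberately
naive: it does not skip placeholders inside SQL string literals or
comments.

```python
import re


def to_numbered_placeholders(query):
    """Rewrite DBAPI 'format' placeholders (%s) as $1, $2, ...

    Hypothetical helper. Returns the rewritten query and the number of
    parameters found. '%%' is treated as an escaped percent sign.
    """
    counter = 0

    def replace(match):
        nonlocal counter
        if match.group(0) == "%%":
            return "%"
        counter += 1
        return "$%d" % counter

    return re.sub(r"%%|%s", replace, query), counter


query, nparams = to_numbered_placeholders(
    "SELECT * FROM t WHERE a = %s AND b = %s")
print(query, nparams)
```

The values themselves would then be handed to libpq unchanged, in the
paramValues array, with no quoting step.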

2) when the driver detects a Python string is being sent to the
database, use binary format.

As a special case, this will avoid having to use PQescapeByteaConn
when sending binary strings (e.g. byte strings in Python 3.x).
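
A minimal sketch of point 2, assuming a hypothetical encode_param helper
that returns the data and the format code (0 = text, 1 = binary) that
would go into the paramValues and paramFormats arrays. For text and bytea
the binary representation is just the raw bytes, so no escaping is
needed. The fallback for other types is an assumption made only to keep
the example complete.

```python
TEXT_FORMAT = 0
BINARY_FORMAT = 1


def encode_param(value, encoding="utf-8"):
    """Return (data, format_code) for one query parameter.

    Hypothetical helper; 'encoding' stands for the client encoding.
    """
    if value is None:
        # SQL NULL: no data is sent for this parameter.
        return None, TEXT_FORMAT
    if isinstance(value, (bytes, bytearray, memoryview)):
        # Byte strings are sent as-is, so PQescapeByteaConn is not needed.
        return bytes(value), BINARY_FORMAT
    if isinstance(value, str):
        # Text in binary format is the encoded string itself.
        return value.encode(encoding), BINARY_FORMAT
    # Assumption: everything else falls back to its text representation.
    return str(value).encode(encoding), TEXT_FORMAT


print(encode_param(b"\x00\xff"))
print(encode_param("hello"))
```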

3) enable use of prepared statements, but only if the user requested it,
using the setinputsizes function (used to set the Oids of the parameters)

4) when using a prepared statement, check the Oids of the result tuple.

In order to make this efficient, I proposed a patch to send a
Describe Portal message in PQsendPrepare function.

When the driver detects that one of the result columns is of a string
type, set the result format for that column to binary.

As a special case, this will avoid having to use PQunescapeBytea
when receiving bytea data.

Setting the result format per column is currently impossible with
the libpq API.
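
To make point 4 concrete, here is a sketch of how the driver could pick
the per-column format codes once it knows the result Oids. The function
names are hypothetical; the Oids are the built-in ones for bytea, name,
text, bpchar and varchar. The second function only shows how the codes
are laid out at the protocol level in the Bind message (an Int16 count
followed by one Int16 per column); it does not correspond to any existing
libpq call.

```python
import struct

BYTEA_OID, NAME_OID, TEXT_OID, BPCHAR_OID, VARCHAR_OID = 17, 19, 25, 1042, 1043

# Column types the driver would ask for in binary format.
BINARY_OIDS = frozenset(
    [BYTEA_OID, NAME_OID, TEXT_OID, BPCHAR_OID, VARCHAR_OID])


def result_format_codes(column_oids):
    """Return one format code per result column (0 = text, 1 = binary)."""
    return [1 if oid in BINARY_OIDS else 0 for oid in column_oids]


def pack_result_formats(codes):
    """Pack the codes as in the Bind message: Int16 count, Int16 codes."""
    return struct.pack("!h%dh" % len(codes), len(codes), *codes)


# int4, text, bytea
codes = result_format_codes([23, 25, 17])
print(codes, pack_result_formats(codes))
```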

5) when returning the result set of a query, after a call to
cursor.fetchall(), do not convert all the data to Python objects.

This will be done only "on request".

This should optimize memory usage, as reported in:
http://wiki.postgresql.org/wiki/Python_PostgreSQL_Driver_TODO
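
A minimal sketch of the "on request" conversion in point 5, assuming a
hypothetical LazyRow class: fetchall() would return rows that keep the
raw values received from the server and convert a column to a Python
object only the first time it is accessed.

```python
class LazyRow:
    """Row that converts a column only when it is first accessed.

    Hypothetical class: 'raw_values' holds the data as received from the
    server (None for NULL), 'converters' holds one callable per column.
    """

    def __init__(self, raw_values, converters):
        self._raw = raw_values
        self._converters = converters
        self._cache = {}

    def __len__(self):
        return len(self._raw)

    def __getitem__(self, index):
        if index not in self._cache:
            raw = self._raw[index]  # raises IndexError when out of range
            if raw is None:
                self._cache[index] = None
            else:
                self._cache[index] = self._converters[index](raw)
        return self._cache[index]


row = LazyRow((b"42", b"abc", None), (int, bytes.decode, int))
print(row[0], row[1], row[2])
```

Columns that are never read are never converted, which is where the
memory saving would come from.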

6) make available the use of PQsetSingleRowMode, to optimize large
result sets (as an option to the connection.cursor method)

7) as a generalization of PQsetSingleRowMode, expose in the libpq API
some of the protocol's internal portal API.

One possible idea is to add a PQsetRowSize function, that will set
the maximum number of rows to return, to be used in the Execute
protocol message (currently libpq always sets it to 0, to get the
entire result set, and it does not support the Portal Suspended
message).

This will avoid having to use a named cursor, as it is done in psycopg.

I'll try to make a patch to check if this is feasible, can be
done efficiently, and the new API has a minimal impact on the
existing API.
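
For reference, this is what the row limit in point 7 looks like at the
protocol level. The sketch builds the Execute message by hand: the type
byte 'E', an Int32 length that counts itself but not the type byte, the
null-terminated portal name, and an Int32 maximum number of rows (0 means
no limit). The function name is hypothetical, and PQsetRowSize is only
the proposed API, so nothing here is an existing libpq call.

```python
import struct


def build_execute_message(portal_name=b"", max_rows=0):
    """Build an Execute message for the extended query protocol.

    Hypothetical helper. An empty portal name selects the unnamed
    portal; max_rows=0 asks for the whole result set.
    """
    body = portal_name + b"\x00" + struct.pack("!i", max_rows)
    # The length field includes its own four bytes.
    return b"E" + struct.pack("!i", len(body) + 4) + body


# What libpq sends today: unnamed portal, no row limit.
print(build_execute_message())
# With a limit of 100 rows the server would reply with Portal Suspended
# when more rows are available.
print(build_execute_message(max_rows=100))
```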

Note that I will have to code these features, in order to check they
will work as I expect.

> [...]
>>
>> [1] A new Python PostgreSQL driver, implemented following
>> http://wiki.postgresql.org/wiki/Driver_development
>> and with many optimization (compared to psycopg2) enabled by the
>> use of the extended query protocol
>
> I think that you'll need to explain in more detail why
> your proposed additions would be necessary for your project.
> Especially since many good drivers have been written against
> libpq as it is.
>
> Yours,
> Laurenz Albe
>

Thanks,
Manlio Perillo

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlEc81MACgkQscQJ24LbaURO9ACfctOREoaAtMDm06Sg+qv5jesj
iW0An1CVAOaHzYaSn+P1AIJvXpI7nVT0
=rK4j
-----END PGP SIGNATURE-----
