Incremental results from libpq

From: Scott Lamb <slamb(at)slamb(dot)org>
To: pgsql-interfaces(at)postgresql(dot)org
Subject: Incremental results from libpq
Date: 2005-11-09 19:36:29
Message-ID: 4D426B54-E08F-4DE9-9776-D46610CA4968@slamb.org
Lists: pgsql-interfaces

I'm using libpq's asynchronous interface. I'd like to do the
equivalent of setting fetchSize on the JDBC driver - get a chunk of
results, handle them, free the memory they use, and then get another
until I've handled an entire query.

I can do this at the SQL layer using "declare cursor ..." and
"fetch forward n ...", but it seems like the lower level should be
able to do this for me. It'd also let me have a more natural
interface that (A) doesn't make the caller take a
PostgreSQL-specific declare/fetch path for each query and (B) can
still use the JDBC-style "execute" that doesn't care whether it's
dealing with a row-returning statement.
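
For concreteness, the SQL-level workaround I mean looks something
like this in C (just a sketch: the cursor name, the chunk size of
1000, and handle_row() are placeholders, and error checking is
omitted):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Stand-in for real per-row processing. */
    static void
    handle_row(const PGresult *res, int row)
    {
        printf("%s\n", PQgetvalue(res, row, 0));
    }

    /* Fetch the query's results 1000 rows at a time through a
     * cursor, freeing each chunk before asking for the next. */
    static void
    fetch_in_chunks(PGconn *conn, const char *query)
    {
        char        declare_sql[1024];

        snprintf(declare_sql, sizeof(declare_sql),
                 "DECLARE csr CURSOR FOR %s", query);

        PQclear(PQexec(conn, "BEGIN"));
        PQclear(PQexec(conn, declare_sql));

        for (;;)
        {
            PGresult   *res = PQexec(conn, "FETCH FORWARD 1000 FROM csr");
            int         i, n = PQntuples(res);

            for (i = 0; i < n; i++)
                handle_row(res, i);
            PQclear(res);           /* free this chunk's memory */
            if (n == 0)
                break;              /* cursor exhausted */
        }

        PQclear(PQexec(conn, "CLOSE csr"));
        PQclear(PQexec(conn, "COMMIT"));
    }

That works, but every caller has to know to do it.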

I see that the JDBC driver (at least in protocol version 3; I
don't care about 2) does this by passing a maximum number of rows
when sending Execute, then handling PortalSuspended and sending
Execute again. I also see that libpq never sends a maximum number
of rows or handles PortalSuspended.

Still, I think it should be able to do what I want. The results
are sent from the database in order. Handling PortalSuspended
would be necessary to keep the database from sending more than N
rows at a time, but it isn't necessary for the client to handle
rows as soon as it has them.

I had been retrieving results from a query in this fashion:

    while True:
        readfds = [PQsocket(conn)]
        writefds = []
        if PQflush(conn):
            writefds = readfds
        if not PQconsumeInput(conn):
            error
        if not PQisBusy(conn):
            break
        poll(readfds, writefds)
    return PQgetResult(conn)

which the documentation recommends. But PQisBusy(conn) doesn't return
false until the _entire_ resultset has been retrieved from the
server. And if I skip PQisBusy(conn) and go straight for the
PQgetResult(), it blocks internally until it can complete.
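
For reference, here is that documented loop spelled out in C as I
understand it (a rough sketch using select(), with error handling
abbreviated):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/select.h>
    #include <libpq-fe.h>

    /* Wait until PQgetResult() won't block, then call it.  This
     * only returns once libpq has buffered the entire result set. */
    static PGresult *
    wait_for_result(PGconn *conn)
    {
        while (PQisBusy(conn))
        {
            int         sock = PQsocket(conn);
            fd_set      readfds, writefds;

            FD_ZERO(&readfds);
            FD_ZERO(&writefds);
            FD_SET(sock, &readfds);

            /* PQflush() returns 1 while queued output remains unsent. */
            if (PQflush(conn) == 1)
                FD_SET(sock, &writefds);

            if (select(sock + 1, &readfds, &writefds, NULL, NULL) < 0)
            {
                perror("select");
                exit(1);
            }

            /* Move whatever arrived into libpq's buffer. */
            if (!PQconsumeInput(conn))
            {
                fprintf(stderr, "%s", PQerrorMessage(conn));
                exit(1);
            }
        }
        return PQgetResult(conn);
    }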

I looked inside libpq, and came up with this instead:

    while True:
        readfds = [PQsocket(conn)]
        writefds = []
        if PQflush(conn):
            writefds = readfds
        if not PQconsumeInput(conn):
            error
        if not PQisBusy(conn):
            break
        if conn->result != NULL and PQntuples(conn->result) > retrieved:
            return conn->result
        poll(readfds, writefds)
    last = True
    return PQgetResult(conn)

where "retrieved" is the number of rows I've examined so far, and
"last" indicates that I shouldn't call again.

That approach is one-third right:

- It does return results incrementally; good.
- It pokes inside libpq; ugh.
- It doesn't free any memory until the whole query's done. I suppose
I could do that by changing conn->result myself, but...ugh. Is there
a better way?

--
Scott Lamb <http://www.slamb.org/>
