Re: Proposal: efficient iter on named cursors

From: Federico Di Gregorio <federico(dot)digregorio(at)dndg(dot)it>
To: psycopg(at)postgresql(dot)org
Subject: Re: Proposal: efficient iter on named cursors
Date: 2011-01-13 16:46:41
Message-ID: 4D2F2C71.8080805@dndg.it
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: psycopg

On 13/01/2011 17:37, Daniele Varrazzo wrote:
[snip]
> There is a shortcoming though: iter(cursor) will fetch the records one
> at a time, with a noticeable time overhead in case of large recordsets
> (exactly the ones you may want to retrieve with a named cursors...)
> Currently the most efficient way to iterate on a named cursor is
> something like:
>
> nrecs = 100 # or some other reasonable number
> while 1:
> recs = cur.fetchmany(nrecs)
> if not recs:
> break
> for rec in recs:
> # do something
>
> This would use only the memory used by nrecs record on the client and
> require just 1/nrecs of the roundtrips required by a naive operation.
> But it make the named cursors harder to use and not a drop-in
> replacement for regular cursors that can be idiomatically used with:
>
> for rec in cur:
> # do something
>
> So, I'd like to modify the cursor so that in case of __iter__, a
> certain number of record is fetched and iteration is performed on
> them. The cursor already has the state to keep the dataset so probably
> only the code would require change, not so much the data structures.
>
> How do we make the users choose their nrecs? I think the cursor should
> have an attribute with a sensible default: 100? 1000? 1024? What
> attribute name?
[snip]
> Comments?

I mostly agree. I'd like to see a .fetchsize parameter both on the
module, on the connection and on the cursor (usual psycopg cascade
initialization). A good default value, supposing rows of 10-20 columns
with a mix of textual and numeric data, is ~2000 because it will
probably fetch less than 100KB of data and that seems a good compromise
between the number of rows transferred and data you'll throw away if you
decide to leave the loop early.

federico

--
Federico Di Gregorio federico(dot)digregorio(at)dndg(dot)it
Studio Associato Di Nunzio e Di Gregorio http://dndg.it
If nobody understand you, that doesn't mean you're an artist.
-- anonymous

In response to

Responses

Browse psycopg by date

  From Date Subject
Next Message Daniele Varrazzo 2011-01-13 17:06:19 Please help fixing a couple of zope-related issues
Previous Message Daniele Varrazzo 2011-01-13 16:37:47 Proposal: efficient iter on named cursors