Skip site navigation (1) Skip section navigation (2)

Re: Proposal: efficient iter on named cursors

From: Federico Di Gregorio <federico(dot)digregorio(at)dndg(dot)it>
To: psycopg(at)postgresql(dot)org
Subject: Re: Proposal: efficient iter on named cursors
Date: 2011-01-13 16:46:41
Message-ID: 4D2F2C71.8080805@dndg.it (view raw or flat)
Thread:
Lists: psycopg
On 13/01/2011 17:37, Daniele Varrazzo wrote:
[snip]
> There is a shortcoming though: iter(cursor) will fetch the records one
> at a time, with a noticeable time overhead in case of large recordsets
> (exactly the ones you may want to retrieve with a named cursors...)
> Currently the most efficient way to iterate on a named cursor is
> something like:
> 
>     nrecs = 100 # or some other reasonable number
>     while 1:
>         recs = cur.fetchmany(nrecs)
>         if not recs:
>             break
>         for rec in recs:
>             # do something
> 
> This would use only the memory used by nrecs record on the client and
> require just 1/nrecs of the roundtrips required by a naive operation.
> But it make the named cursors harder to use and not a drop-in
> replacement for regular cursors that can be idiomatically used with:
> 
>     for rec in cur:
>         # do something
> 
> So, I'd like to modify the cursor so that in case of __iter__, a
> certain number of record is fetched and iteration is performed on
> them. The cursor already has the state to keep the dataset so probably
> only the code would require change, not so much the data structures.
> 
> How do we make the users choose their nrecs? I think the cursor should
> have an attribute with a sensible default: 100? 1000? 1024? What
> attribute name?
[snip]
> Comments?

I mostly agree. I'd like to see a .fetchsize parameter both on the
module, on the connection and on the cursor (usual psycopg cascade
initialization). A good default value, supposing rows of 10-20 columns
with a mix of textual and numeric data, is ~2000 because it will
probably fetch less than 100KB of data and that seems a good compromise
between the number of rows transferred and data you'll throw away if you
decide to leave the loop early.

federico

-- 
Federico Di Gregorio                         federico(dot)digregorio(at)dndg(dot)it
Studio Associato Di Nunzio e Di Gregorio                  http://dndg.it
 If nobody understand you, that doesn't mean you're an artist.
                                                            -- anonymous

In response to

Responses

psycopg by date

Next:From: Daniele VarrazzoDate: 2011-01-13 17:06:19
Subject: Please help fixing a couple of zope-related issues
Previous:From: Daniele VarrazzoDate: 2011-01-13 16:37:47
Subject: Proposal: efficient iter on named cursors

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group