From: | Vladimir Ryabtsev <greatvovan(at)gmail(dot)com> |
---|---|
To: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
Cc: | psycopg(at)postgresql(dot)org |
Subject: | Re: Memory |
Date: | 2024-12-21 21:52:40 |
Message-ID: | CAMqTPqktrwkxn_11wPXfJRWoe=KgLTXo7j9U3Xxj9UBPYaKf-Q@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | psycopg |
Now I see, the doc is already great.
Thanks for pointing out Adrian.
On Sat, Dec 21, 2024 at 8:34 AM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:
> On 12/21/24 02:45, Vladimir Ryabtsev wrote:
> > Hi community,
> >
> > I am reading a big dataset using code similar to this:
> >
> > query = '''
> > SELECT timestamp, data_source, tag, agg_value
> > FROM my_table
> > '''I
> > batch_size = 10_000_000
> >
> > with psycopg.connect(cs, cursor_factory=psycopg.ClientrCursor) as conn:
>
> FYI, ClientCursor.
>
>
> >
> > I looked the documentation, but did not find specifics related to
> > performance differences between Server and Client cursors.
> >
> > I am fine with ServerCursor, but I need to ask, is it by design that
> > with ClientCursor the result set is copied into memory despite
> > fetchmany() limit? ClientCursor is the default class, so may be worth
> > documenting the difference (sorry, if I missed that).
>
> Client side cursor
>
>
> https://www.psycopg.org/psycopg3/docs/advanced/cursors.html#client-side-cursors
>
> "In such querying pattern, after a cursor sends a query to the server
> (usually calling execute()), the server replies transferring to the
> client the whole set of results requested, which is stored in the state
> of the same cursor and from where it can be read from Python code (using
> methods such as fetchone() and siblings)."
>
>
> https://www.psycopg.org/psycopg3/docs/api/cursors.html#psycopg.Cursor.fetchmany
>
> "fetchmany(size: int = 0) → list[+Row]
>
> Return the next size records from the current recordset.
>
> size default to self.arraysize if not specified.
>
> Return type:
>
> Sequence[Row], with Row defined by row_factory
>
> "
>
>
> Server side cursor
>
>
> https://www.psycopg.org/psycopg3/docs/advanced/cursors.html#server-side-cursors
>
> "PostgreSQL has its own concept of cursor too (sometimes also called
> portal). When a database cursor is created, the query is not necessarily
> completely processed: the server might be able to produce results only
> as they are needed. Only the results requested are transmitted to the
> client: if the query result is very large but the client only needs the
> first few records it is possible to transmit only them.
>
> The downside is that the server needs to keep track of the partially
> processed results, so it uses more memory and resources on the server."
>
> >
> > Thank you.
> >
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Daniele Varrazzo | 2025-04-26 00:25:47 | Python 3.14, template-strings, and psycopg |
Previous Message | Adrian Klaver | 2024-12-21 16:34:51 | Re: Memory |