Producer/Consumer Issues in the COPY across network

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Producer/Consumer Issues in the COPY across network
Date: 2008-02-26 11:00:33
Message-ID: 1204023633.4252.225.camel@ebony.site
Lists: pgsql-hackers

I'm looking at ways to reduce the number of network calls and/or the
waiting time while we perform network COPY.

The COPY calls in libpq support asynchronous operation, yet pg_dump, Slony
and psql's \copy all drive them synchronously.
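
For reference, the synchronous pattern those clients follow is roughly the
sketch below; the table name and output file are illustrative only, not
taken from any of them:

    #include <stdio.h>
    #include <libpq-fe.h>

    static int
    copy_out_sync(PGconn *conn, FILE *dst)
    {
        PGresult   *res = PQexec(conn, "COPY mytable TO STDOUT");
        char       *row;
        int         len;

        if (PQresultStatus(res) != PGRES_COPY_OUT)
        {
            fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
            PQclear(res);
            return -1;
        }
        PQclear(res);

        /* async flag = 0: each call blocks until a whole row has arrived */
        while ((len = PQgetCopyData(conn, &row, 0)) > 0)
        {
            fwrite(row, 1, len, dst);
            PQfreemem(row);
        }
        if (len == -2)
            return -1;              /* error; see PQerrorMessage(conn) */

        PQclear(PQgetResult(conn)); /* consume the final command status */
        return 0;
    }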

Does anybody have any experience with running COPY in asynchronous
mode?

When we run a COPY over a high-latency link, network time becomes dominant,
so running COPY asynchronously could improve performance for data loads or
for initial Slony configuration. This matters even more for Slony, where we
call both PQgetCopyData() and PQputCopyData() in a tight loop, roughly as in
the sketch below.
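
A minimal sketch of what an asynchronous version of that loop could look
like, assuming a COPY OUT is already in progress on the source connection
and a COPY IN on the target (connection names and error handling here are
illustrative, not Slony's actual code):

    #include <sys/select.h>
    #include <libpq-fe.h>

    /*
     * Relay rows from src (COPY OUT already started) to dst (COPY IN
     * already started) without blocking on each row read.
     */
    static int
    copy_relay_async(PGconn *src, PGconn *dst)
    {
        char   *row;
        int     len;

        for (;;)
        {
            /* async = 1: return 0 at once if no complete row is buffered */
            len = PQgetCopyData(src, &row, 1);

            if (len > 0)
            {
                if (PQputCopyData(dst, row, len) != 1)
                    return -1;          /* error on the target */
                PQfreemem(row);
            }
            else if (len == 0)
            {
                /* no row ready: wait on the source socket, then let
                 * libpq read whatever has arrived */
                fd_set  rfds;

                FD_ZERO(&rfds);
                FD_SET(PQsocket(src), &rfds);
                if (select(PQsocket(src) + 1, &rfds, NULL, NULL, NULL) < 0)
                    return -1;
                if (!PQconsumeInput(src))
                    return -1;          /* read error on the source */
            }
            else if (len == -1)
                break;                  /* COPY OUT complete */
            else
                return -1;              /* len == -2: error on the source */
        }

        return (PQputCopyEnd(dst, NULL) == 1) ? 0 : -1;
    }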

I also note that PQgetCopyData always returns exactly one row per call. Is
there underlying buffering between the protocol (which sends one message per
row) and libpq (which requires one call per row)? It seems possible for us
to request a batch of rows from the server, up to a preferred total transfer
size.

PQputCopyData seems to be more efficient with smaller rows.
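
One way to probe the per-call cost on the sending side would be to coalesce
several rows into a single PQputCopyData call and compare; a minimal sketch
of that experiment (the 64 kB batch size is an arbitrary assumption, and it
relies on CopyData messages not having to align with row boundaries, so
concatenating rows is safe):

    #include <string.h>
    #include <libpq-fe.h>

    #define BATCH_BYTES 65536       /* arbitrary target, worth varying */

    static char     batch[BATCH_BYTES];
    static size_t   batch_used = 0;

    /* Queue one row, sending the accumulated batch when it fills up. */
    static int
    put_row_batched(PGconn *dst, const char *row, size_t len)
    {
        if (batch_used + len > sizeof(batch))
        {
            if (PQputCopyData(dst, batch, (int) batch_used) != 1)
                return -1;
            batch_used = 0;
        }

        if (len > sizeof(batch))    /* oversized row: send it directly */
            return (PQputCopyData(dst, row, (int) len) == 1) ? 0 : -1;

        memcpy(batch + batch_used, row, len);
        batch_used += len;
        return 0;
    }

    /* Call once before PQputCopyEnd() to push out any remaining rows. */
    static int
    flush_batch(PGconn *dst)
    {
        if (batch_used > 0 && PQputCopyData(dst, batch, (int) batch_used) != 1)
            return -1;
        batch_used = 0;
        return 0;
    }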

Ideas? Experience?

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
