From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: reducing IO and memory usage: sending the content of a table to multiple files
Date: 2009-04-03 01:05:19
Message-ID: 20090403010519.GT12225@frubble.xen.chris-lamb.co.uk
Lists: pgsql-general

On Thu, Apr 02, 2009 at 09:48:33PM +0200, Ivan Sergio Borgonovo wrote:
> I didn't find any elegant example of cursor use in PHP... OK, PHP is
> not the most elegant language around... but still, any good example
> someone could point me at?

I don't program PHP, but my guess would be something like:

pg_query("BEGIN;");
pg_query("DECLARE cur CURSOR FOR SELECT * FROM t1;");
while (pg_num_rows($result = pg_query("FETCH 1000 FROM cur;")) > 0) {
while($row = pg_fetch_array($result)) {
}
}
pg_query("COMMIT;");

You can obviously increase the FETCH batch size, and if you're feeling
fancy you could even run the FETCH asynchronously alongside the code
that processes the results. Maybe something like:

pg_query($conn, "BEGIN;");
pg_query($conn, "DECLARE cur CURSOR FOR SELECT * FROM t1;");
pg_send_query($conn, "FETCH 1000 FROM cur;");
while (1) {
    $result = pg_get_result($conn);  // blocks until the batch arrives
    if (pg_get_result($conn) !== false) {
        // badness, only expecting a single result per FETCH
    }
    // ask for the next batch while we process the current one
    pg_send_query($conn, "FETCH 1000 FROM cur;");
    if (pg_num_rows($result) == 0)
        break;
    while ($row = pg_fetch_array($result)) {
        // process $row here
    }
}
pg_get_result($conn);  // drain the final, empty FETCH
pg_query($conn, "COMMIT;");

Note, I've never tried to do PG database stuff from PHP, let alone stuff
like this, so it may be all wrong! AFAICT there's no need to bother
with pg_connection_busy() because the call to pg_get_result() will block
until the results come back from the database.

> So I think the largest cost of the operation will be IO.
> \copy should be optimised for "raw" data output, but maybe all its
> advantages get lost once I have to use pipes and add complexity for
> filtering.

Streaming IO is pretty fast; I think you'll be hard pushed to keep up
with it from PHP and you'll end up CPU bound in no time. It would be
interesting to find out, though.
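
If you do want to try the COPY route from PHP, pg_copy_to() is the
obvious entry point, though note it hands back the whole table as an
array of raw delimited lines, so it buffers everything in memory
anyway. A rough sketch (untested; the output path is just a
placeholder):

// pull t1 out as tab-delimited text and write it to a file;
// note pg_copy_to() buffers the entire table in a PHP array first
$lines = pg_copy_to($conn, 't1', "\t");
$fh = fopen('/tmp/t1.out', 'w');
foreach ($lines as $line) {
    fwrite($fh, $line);  // each $line is one raw row, newline included
}
fclose($fh);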

> I was reading through the PHP documentation trying to understand
> how buffering and memory usage work, and I had a look at the MySQL
> documentation too...

Not sure about PG's PHP driver, but the C API (libpq) pretty much
always buffers the whole result set in client memory first. There was
mention of getting control over this, but I've no idea where that got
to.

--
Sam http://samason.me.uk/
