Re: 8.3 / 8.2.6 restore comparison

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 8.3 / 8.2.6 restore comparison
Date: 2008-02-24 17:32:31
Message-ID: 5097.1203874351@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> At some point, I think we have to bite the bullet and find a way to use
> multiple CPUs for a single load. I don't have any good ideas or plans
> for that, but hopefully someone does.

As already mentioned upthread, we could do that today, with zero backend
changes, by making pg_restore drive multiple sessions. Now there are
scenarios where this wouldn't help too much --- eg, a database with only
one enormous table. We couldn't parallelize the loading of that table,
although we could parallelize creation of its indexes. But for an
example such as JD is complaining about, we ought to have no trouble
thrashing his disks into the ground ;-)

What you would need for this is an -Fc or -Ft dump, because a plain
script output from pg_dump doesn't carry any dependency information,
much less any index of where in the file different bits of data are.
Just armwaving, I envision a multiprocess pg_restore like this:

* one controller process that tracks the not-yet-loaded
TOC items in the dump
* N child processes that just transmit a selected TOC item
to a connected backend, and then ask the controller
what to do next

The controller would need to use the dependency information to avoid,
eg, handing out a CREATE INDEX command before the parent table was
created and loaded.

One issue is that this couldn't use "-1" single-transaction restoring,
since obviously each child would need its own transaction, and
furthermore would have to commit before going back to the controller
for more work (since dependent TOC items might well get loaded by
a different child later). That defeats a couple of optimizations that
Simon put in recently. The one for no XLOG during COPY is not too
hard to see how to re-enable, but I'm not sure what else there was.

Most likely, the bottleneck with this sort of thing would be multiple
parallel reads from the pg_dump archive file. Possibly the controller
process could be taught to schedule COPY and CREATE INDEX operations
so that not too many processes are trying to read lots of archive
data at the same time.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Florian G. Pflug 2008-02-24 17:38:19 Re: Behaviour of rows containg not-null domains in plpgsql
Previous Message Andrew Dunstan 2008-02-24 17:28:04 Re: Behaviour of rows containg not-null domains in plpgsql