Parallel backups (was Re: why postgresql over other RDBMS)

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Parallel backups (was Re: why postgresql over other RDBMS)
Date: 2007-05-24 23:33:12
Message-ID: 465620B8.2040107@cox.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 05/24/07 17:21, Chris Browne wrote:
[snip]
>
> This would permit doing a neat parallel decomposition of pg_dump: you
> could do a 4-way parallelization of it that would function something
> like the following:
>
> - connection 1 opens, establishes the usual serialized mode transaction
>
> - connection 1 dumps the table metadata into one or more files in a
> specified directory
>
> - then it forks 3 more connections, and seeds them with the same
> serialized mode state
>
> - it then goes thru and can dump 4 tables concurrently at a time,
> one apiece to a file in the directory.
>
> This could considerably improve speed of dumps, possibly of restores,
> too.

What about a master thread that "establishes the usual serialized
mode transaction" and then issues N asynchronous requests to the
database, and as they return with data, pipe the data to N number of
corresponding "writer" threads. Matching N to the number of tape
drives comes to mind.

Yes, the master thread would be the choke point, but CPUs and RAM
are still a heck of a lot faster than disks, so maybe it wouldn't be
such a problem after all.

Of course, if libpq(??) doesn't handle async IO, then it's not such
a good idea after all.

> Note that this isn't related to subtransactions...

- --
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGViC4S9HxQb37XmcRAgkAAKC4pyZQWDF01S17uITbOkcj+KY8lgCg40pi
2B3xg2tnp554GGP0VsgACWE=
=eIUP
-----END PGP SIGNATURE-----

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Raymond C. Rodgers 2007-05-24 23:44:53 Inheritance question
Previous Message George Pavlov 2007-05-24 23:15:54 index vs. seq scan choice?