On Mon, Mar 29, 2010 at 1:16 PM, Stefan Kaltenbrunner wrote:
> Robert Haas wrote:
>> On Mon, Mar 29, 2010 at 10:46 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
>>> - Regarding the output of pg_dump I am proposing two solutions. The
>>> first one is to introduce a new archive type "directory" where each
>>> table and each blob is a file in a directory, similar to the
>>> experimental "files" archive type. Also the idea has come up that you
>>> should be able to specify multiple directories in order to make use of
>>> several physical disk drives. Thinking this further, in order to
>>> manage all the mess that you can create with this, every file of the
>>> same backup needs to have a unique identifier and pg_restore should
>>> have a check parameter that tells you if your backup directory is in a
>>> sane and complete state (think about moving a file from one backup
>>> directory to another one or trying to restore from two directories
>>> which are from different backup sets...).
>> I think that specifying several directories is a piece of complexity
>> that would be best left alone for a first version of this. But a
>> single directory with multiple files sounds pretty reasonable. Of
>> course we'll also need to support that format in non-parallel mode,
>> and in pg_restore.
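For concreteness, driving such a directory archive could look roughly like the
sketch below; the -Fd and -j spellings, paths and database names are only
illustrative, not necessarily the final interface of the patch:

    pg_dump -Fd -j 8 -f /backup/mydb.dir mydb       # one file per table/blob inside the directory
    pg_restore -j 8 -d mydb_copy /backup/mydb.dir   # parallel restore from the same directory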
>>> The second solution to the single-file-problem is to generate no
>>> output at all, i.e. whatever you export from your source database you
>>> import directly into your target database, which in the end turns out
>>> to be a parallel form of "pg_dump | psql".
>> This is a very interesting idea but you might want to get the other
>> thing merged first, as it's going to present a different set of issues.
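As a very rough picture of what that second mode amounts to, one can already
approximate it by piping per-table data dumps into the target in parallel; the
schema query, database names and job count below are purely illustrative, and
the target schema has to exist already:

    # hand-rolled sketch only: no consistent snapshot, no dependency handling
    psql -At -d sourcedb \
         -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public'" |
      xargs -P8 -I{} sh -c 'pg_dump -a -t {} sourcedb | psql -q -d targetdb'

The point of doing this inside pg_dump/pg_restore is to get a consistent
snapshot and proper dependency ordering instead of this kind of per-table
hand-rolling.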
> I had some prior discussion with Joachim on that (and I suspect I had some
> influence on him trying to implement it).
> The reason why this is really needed is that the current pg_restore -j is
> actually a net loss (vs. "pg_dump | psql") in a LOT of scenarios that are
> basically "duplicate this database to that location" (or any migration scenario).
> The example at hand is a 240GB production database with around 850 tables; it
> takes ~145min to dump that database single-threaded (completely CPU bound),
> simply loading the SQL using psql can restore it in ~150min (again CPU bound,
> both for COPY and index creation), and -j8 brings that down to ~55min.
> So if you do the math (and a bit of handwaving):
> * using pg_dump | psql you get greatest(145,150) -> 150min.
> * using pg_dump -Z0 -Fc && pg_restore -j8 you get 145+55 -> 200min
> * using a theoretical parallel pg_dump and the existing parallel restore you
> would get: 50 (just a guess for how fast it might be) + 55 -> 105min
> * a parallel dump & restore that can pipeline would end up at greatest(50,55) -> 55min
> So a parallel dump alone would only give you a ~50% speedup in total time for
> doing a migration/upgrade/dump-to-devbox despite the fact that it uses 8x
> the resources. A pipelined solution would result in a ~3x speedup in total
> time, and you don't even have to think about stuff that might be a
> problem, like having available disk space on the source/destination to hold a
> full temporary dump (if you don't, you might even have to add some transfer
> time as well).
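To put those numbers side by side (same figures as quoted above, all
approximate; the commands are only sketches):

    # 1) serial dump piped straight into restore: no temp space, one CPU per side
    pg_dump sourcedb | psql targetdb                 # max(145,150) -> ~150min

    # 2) dump to a temporary archive, then parallel restore: needs disk space for
    #    the full dump, and the two phases cannot overlap
    pg_dump -Z0 -Fc -f /tmp/db.dump sourcedb         # ~145min
    pg_restore -j8 -d targetdb /tmp/db.dump          # ~55min, so ~200min total

    # 3) hypothetical parallel dump pipelined into parallel restore: both ends
    #    overlap, so roughly max(50,55) -> ~55min, with no intermediate dump file

That third option is the one that doesn't exist yet, and it's the one that
matters most for plain "copy this database over there" jobs.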
It's a great idea - but there are two features here. I've seen many
patches implementing two features during my relatively short time with
the project, and if the rejection rate hasn't been 100%, it's certainly
been close. If Joachim thinks he's got it all working, by all means
submit both patches. One can apply over the other if they are
interdependent. But I STRONGLY suggest separating this into two
pieces - it is MUCH easier to get things applied that way, for good
and valid reasons.