Re: parallel pg_restore

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Joshua Drake <jd(at)commandprompt(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: parallel pg_restore
Date: 2008-09-23 20:50:43
Message-ID: 48D956A3.5030204@dunslane.net
Lists: pgsql-hackers

Simon Riggs wrote:
> On Tue, 2008-09-23 at 12:43 -0700, Joshua Drake wrote:
>
>> On Tue, 23 Sep 2008 08:44:19 +0100
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>>
>>
>>> On Mon, 2008-09-22 at 15:05 -0400, Andrew Dunstan wrote:
>>>
>>>
>>>> j and m happen to be two of those that are available.
>>>>
>>>> I honestly don't have a terribly strong opinion about what it
>>>> should be called. I can live with jobs or multi-threads.
>>>>
>>> Perhaps we can use -j for jobs and -m for memory, so we can set memory
>>> available across all threads with a single total value.
>>>
>>> I can live with jobs or multi-threads also, whichever we decide.
>>> Neither one is confusing to explain.
>>>
>>>
>> Memory? Where did that come from. Andrew is that in your spec?
>>
>
> No, but it's in mine. As I said upthread, no point in making it more
> parallel than memory allows. Different operations need more/less memory
> than others, so we must think about that also. We can quickly work out
> how big a table is, so we can work out how much memory it will need to
> perform sorts for index builds and thus how many parallel builds can
> sensibly take place.
>
>

If that ever happens it will certainly not be in this go round.

In fact, we have some anecdotal evidence that the point of diminishing
returns is not reached until a fairly high degree of parallelism is used
(Joshua's and my client has been using 24 threads, I believe).

In any case, my agenda goes something like this:

* get it working with a basic selection algorithm on Unix (nearly
done - keep your eyes open for a patch soon)
* start testing
* get it working on Windows
* improve the selection algorithm
* harden code

If we get all that done by November we'll have done well. And we know
that in some cases just this much can lead to reductions in restore time
of the order of 80%.
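
For illustration only, here is a rough sketch of what a "basic selection algorithm" might look like. It is not the code from the forthcoming patch. It assumes each archive item lists the items it depends on, and it hands out at most `jobs` ready items at a time. The names `schedule`, `items`, `deps` and `jobs` are hypothetical, and the model is simplified to rounds in which every running item finishes before the next batch starts.

```python
def schedule(items, deps, jobs):
    """Return the batches in which items would be dispatched.

    items: item names in archive order (hypothetical input)
    deps:  dict mapping an item to the set of items that must finish first
    jobs:  maximum number of items running at once (the proposed -j value)
    """
    done = set()
    pending = list(items)
    rounds = []
    while pending:
        # Take, in archive order, up to `jobs` items whose dependencies are done.
        ready = [i for i in pending if deps.get(i, set()) <= done][:jobs]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        rounds.append(ready)
        done.update(ready)
        pending = [i for i in pending if i not in ready]
    return rounds


if __name__ == "__main__":
    items = ["t1", "t2", "t1_idx", "t2_idx", "fk"]
    deps = {"t1_idx": {"t1"}, "t2_idx": {"t2"}, "fk": {"t1_idx", "t2_idx"}}
    for batch in schedule(items, deps, jobs=2):
        print(batch)
```

A real scheduler would start a new item as soon as any worker becomes free rather than waiting for a whole batch, and "improve the selection algorithm" could mean, for example, preferring larger items first.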

cheers

andrew
