Re: parallel pg_restore - WIP patch

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-10-03 08:51:27
Message-ID: 48E5DD0F.5070700@kaltenbrunner.cc
Lists: pgsql-hackers

Andrew Dunstan wrote:
>
>
> Stefan Kaltenbrunner wrote:
>> Tom Lane wrote:
>>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>>> Tom Lane wrote:
>>>>> Um, FKs could conflict with each other too, so that by itself isn't
>>>>> gonna fix anything.
>>>
>>>> Good point. Looks like we'll need to make a list of "can't run in
>>>> parallel with" items as well as strict dependencies.
>>>
>>> Yeah, I was just thinking about that. The current archive format
>>> doesn't really carry enough information for this. I think there
>>> are two basic solutions we could adopt:
>>>
>>> * Extend the archive format to provide some indication that "restoring
>>> this object requires exclusive access to these dependencies".
>>>
>>> * Hardwire knowledge into pg_restore that certain types of objects
>>> require exclusive access to their dependencies.
>>>
>>> The former seems more flexible, as well as more in tune with the basic
>>> design assumption that pg_restore shouldn't have a lot of knowledge
>>> about individual archive object types. But it would mean that you
>>> couldn't use parallel restore with any pre-8.4 dumps. In the long run
>>> that's no big deal, but in the short run it's annoying.
>>
>> hmm not sure how much of a problem that really is - we usually
>> recommend to use the pg_dump version of the target database anyway.
>>
>
> We don't really need a huge amount of hardwiring as it turns out. Here
> is a version of the patch that tries to do what's needed in this area.

This one is much better. However, I still seem to be able to create
deadlock scenarios with unusual FK relations, i.e. FKs going in both
directions between two tables.
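
To illustrate the "can't run in parallel with" idea from the quoted discussion, here is a minimal Python sketch of a scheduler that never runs two items together when they need exclusive access to the same table. It is not taken from the patch: the `Item` class, the `conflicts` and `schedule` helpers, and the assumption that each FK creation needs exclusive access to both of its tables are all hypothetical and only serve to show why two FKs pointing in opposite directions between the same pair of tables must be serialized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One restore step (hypothetical model, not pg_restore's TOC entry)."""
    name: str
    locks: frozenset  # tables this step is assumed to need exclusive access to


def conflicts(a, b):
    """Two steps conflict when their lock sets share at least one table."""
    return bool(a.locks & b.locks)


def schedule(items, workers):
    """Greedily group steps into batches of at most `workers` steps whose
    lock sets are pairwise disjoint. Each batch could run in parallel."""
    pending = list(items)
    batches = []
    while pending:
        batch = []
        for item in pending:
            if len(batch) < workers and not any(conflicts(item, other) for other in batch):
                batch.append(item)
        pending = [item for item in pending if item not in batch]
        batches.append(batch)
    return batches


# FKs going in both directions between tables a and b, plus an unrelated FK.
fk_a_to_b = Item("FK a -> b", frozenset({"a", "b"}))
fk_b_to_a = Item("FK b -> a", frozenset({"a", "b"}))
fk_c_to_d = Item("FK c -> d", frozenset({"c", "d"}))

batches = schedule([fk_a_to_b, fk_b_to_a, fk_c_to_d], workers=2)
for number, batch in enumerate(batches, 1):
    print(number, [item.name for item in batch])
```

With two workers, the two FKs between a and b land in different batches, while the unrelated FK on c and d can share a batch with one of them. A real implementation would also have to honor ordinary dependencies, which this sketch ignores.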

For those interested, these are the timings on my 8-core test box for my
test database:

single process restore: 169min
-m2: 101min
-m6: 64min
-m8: 63min
-m16: 56min
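
For a rough sense of scaling, the speedup relative to the single-process restore can be computed from the timings above. This is only a small arithmetic sketch; the dictionary keys simply reuse the labels from the list, and nothing here is measured beyond the numbers already given.

```python
# Restore times in minutes, copied from the timings listed above.
timings = {"single": 169, "-m2": 101, "-m6": 64, "-m8": 63, "-m16": 56}

baseline = timings["single"]

# Speedup factor of each parallel run relative to the single-process restore.
speedup = {
    label: baseline / minutes
    for label, minutes in timings.items()
    if label != "single"
}

for label, factor in speedup.items():
    print(f"{label}: {factor:.2f}x faster than the single-process restore")
```

On these numbers, -m2 comes out at roughly 1.67x, -m6 and -m8 are nearly identical at about 2.64x and 2.68x, and -m16 reaches about 3.02x.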

Stefan
