Re: parallel restore

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel restore
Date: 2009-02-24 14:20:18
Message-ID: 49A40222.8050907@dunslane.net
Lists: pgsql-hackers

I wrote:
>
>
> Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>  
>>> Tom Lane wrote:
>>>    
>>>> There is an unfinished TODO item here: we really ought to make it work
>>>> for tar-format archives.  That's probably not hugely difficult, but
>>>> I didn't look into it, and don't think we should hold up applying the
>>>> existing patch for it.
>>>>       
>>
>>  
>>> Right. Were you thinking this should be done for 8.4?
>>>     
>>
>> If you have time to look into it, sure.  Otherwise we should just put it
>> on the TODO list.
>>
>>            
>>   
>
> I've had a look at this. If our tar code supported out-of-order 
> restoration (using fseeko) I'd be done. But it doesn't, and I won't get 
> that done for 8.4, if at all. I'm not sure what would be involved in 
> making it work.
>
>

OK, I've spent some more time on this. When writing a custom format 
file, pg_dump writes out the header and table of contents and then the 
data members, keeping track of where each one starts. If the output is 
seekable (as it usually is), it then rewrites the table of contents, this 
time including the data member offsets. Parallel restore requires that 
this offset info be available. If the output file was not seekable by 
pg_dump (e.g. if it was a pipe), the archive is unsuitable for use with 
parallel restore, which will fail.
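
To make the two-pass behaviour above concrete, here is a minimal sketch in Python. It is a toy layout, not pg_dump's actual archive format or C code: the names write_archive, read_toc, read_member and the UNKNOWN placeholder are all made up for illustration. It writes a TOC with placeholder offsets, then the data members, and rewrites the TOC only when the output is seekable.

```python
import io
import struct

# Hypothetical placeholder meaning "offset was never recorded".
UNKNOWN = 0xFFFFFFFFFFFFFFFF

def write_archive(out, members, seekable):
    """Toy archive: member count, TOC of (id, offset) pairs, then the data."""
    ids = sorted(members)
    out.write(struct.pack("<I", len(ids)))
    toc_start = 4
    # First pass: TOC entries carry placeholder offsets.
    for mid in ids:
        out.write(struct.pack("<IQ", mid, UNKNOWN))
    pos = toc_start + 12 * len(ids)
    offsets = {}
    # Write each data member, remembering where it starts.
    for mid in ids:
        offsets[mid] = pos
        data = members[mid]
        out.write(struct.pack("<I", len(data)))
        out.write(data)
        pos += 4 + len(data)
    # Second pass, only possible on seekable output: fill in real offsets.
    if seekable:
        out.seek(toc_start)
        for mid in ids:
            out.write(struct.pack("<IQ", mid, offsets[mid]))
        out.seek(0, io.SEEK_END)

def read_toc(buf):
    """Return {member id: offset} from the TOC at the front of the archive."""
    (n,) = struct.unpack_from("<I", buf, 0)
    return dict(struct.unpack_from("<IQ", buf, 4 + 12 * i) for i in range(n))

def read_member(buf, offset):
    """Fetch one data member directly by its recorded offset."""
    (length,) = struct.unpack_from("<I", buf, offset)
    return buf[offset + 4 : offset + 4 + length]
```

With seekable=False (standing in for a pipe) every TOC entry keeps the placeholder, so a reader that wants to jump straight to a given member has nothing to go on. That is the situation in which parallel restore cannot work.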

In the case of tar output, pg_dump doesn't make any effort to keep the 
offset info at all, so parallel restore is not currently suitable for 
use with tar output, regardless of whether or not the pg_dump output was 
seekable.

I think we could cure both of these cases by having pg_dump write out a 
second copy of the table of contents, including data member offsets, at 
the end of the archive. Or it might just be a table of <data-member-id, 
offset> pairs if we're worried about space. In the latter case we'd need 
to go back and fix up the TOC, but that would be fairly simple. Either 
way I think we'd need to bump the archive version number so we'd know 
when to expect this.
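
A rough sketch of the offsets-at-the-end idea, again in Python and again with an invented layout. The fixed-size trailer holding the pair count is an assumption added here so that a reader can locate the table from the end of the file; the proposal above only says a table of pairs would be appended. CountingWriter, dump_with_trailer, load_offsets and read_member are hypothetical names.

```python
import io
import struct

class CountingWriter:
    """Wraps a write-only stream and counts bytes, so no seek or tell is needed."""
    def __init__(self, raw):
        self.raw = raw
        self.pos = 0

    def write(self, data):
        self.raw.write(data)
        self.pos += len(data)

def dump_with_trailer(raw, members):
    """Write members strictly sequentially, then append (id, offset) pairs."""
    out = CountingWriter(raw)
    offsets = []
    for mid in sorted(members):
        offsets.append((mid, out.pos))
        data = members[mid]
        out.write(struct.pack("<I", len(data)))
        out.write(data)
    # Trailing table of <data-member-id, offset> pairs.
    for mid, off in offsets:
        out.write(struct.pack("<IQ", mid, off))
    # Assumed trailer: the number of pairs, so the table can be found from EOF.
    out.write(struct.pack("<I", len(offsets)))

def load_offsets(f):
    """Restore side: the archive file is seekable even if the dump output was not."""
    f.seek(-4, io.SEEK_END)
    (n,) = struct.unpack("<I", f.read(4))
    f.seek(-(4 + 12 * n), io.SEEK_END)
    return dict(struct.unpack("<IQ", f.read(12)) for _ in range(n))

def read_member(f, offset):
    """Jump straight to one data member using an offset from the trailing table."""
    f.seek(offset)
    (length,) = struct.unpack("<I", f.read(4))
    return f.read(length)
```

The writer never seeks, so the same code path would serve a pipe. The reader pays one seek to the end of the file to recover the offsets, after which it could fix up its in-memory TOC and fetch members in any order.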

Once we have that, the custom format code should work no matter how the 
dump was made, and parallel restore should work with tar format once we 
add code to it to seek for data members.

I think all of this can wait until 8.5, except that we should possibly 
document the current limitations on parallel restore a bit more completely.

(I was initially tempted to say we'd need compression of individual data 
members in tar format to do this sanely, but since the 
offsets-at-the-end suggestion should work even when pg_dump is 
outputting to a pipe, we'd still be able to send the output through gzip 
and so get a conventional .tgz file.)

cheers

andrew
