Re: Parallel pg_restore versus old dump files

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Igor Neyman <ineyman(at)perceptron(dot)com>
Subject: Re: Parallel pg_restore versus old dump files
Date: 2010-06-23 01:26:47
Message-ID: 11349.1277256407@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> In short, parallel pg_restore is guaranteed to fail on any input file
>> made with a pre-8.4 pg_dump on Windows.

> IIRC, you can reproduce this on Unix too by sending the output of
> pg_dump into a pipe. So it's not uniquely a Windows problem.

Right. We need to be able to cope, albeit with degraded performance.

> As Greg suggests, the solution would be to have a second TOC at the end
> of the file with the offsets.

Uh, that doesn't fix anything: if you can't seek, a TOC at the end of
the file is useless. And the cases where the writer can't seek are
likely to be exactly the ones where the reader can't seek either, viz.
pg_dump piped to pg_restore (perhaps with some other programs between).
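
To illustrate the point, here is a rough sketch in Python rather than pg_restore's C. The helper name can_use_trailing_toc is hypothetical and nothing here is actual pg_dump code; it only shows that a reader fed from a pipe cannot jump to the end of its input, so an offset table stored there is out of reach.

```python
import io
import os


def can_use_trailing_toc(stream):
    """Hypothetical check: could a reader jump to a TOC stored at end of file?"""
    try:
        return stream.seekable()
    except AttributeError:
        # Not a file-like object with seek support at all.
        return False


# A regular (here: in-memory) file can be repositioned freely.
regular = io.BytesIO(b"archive bytes")

# The read end of a pipe cannot, which is the "pg_dump | pg_restore" case.
r_fd, w_fd = os.pipe()
pipe_reader = os.fdopen(r_fd, "rb")
pipe_writer = os.fdopen(w_fd, "wb")

regular_ok = can_use_trailing_toc(regular)
pipe_ok = can_use_trailing_toc(pipe_reader)

pipe_writer.close()
pipe_reader.close()
```

With a pipe as input the only option left is to read forward, which is why a second TOC would not help in that case.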

>> Another possibility is to just remove the inside-the-loop error test
>> altogether: make it just skip till it finds the desired item, and only
>> throw an error if it hits EOF without finding it. In the case that
>> the error test is trying to catch, this would mean significantly more
>> work done before reporting the error, but do we really care? I'm
>> leaning to this solution because it would not require exporting state
>> from the parallel restore control logic.

> Would exporting a bit of state be so bad?

The threaded case seems a bit messy, and frankly I don't believe that
we'd be buying anything. The error case never actually occurs in the real
world, except perhaps on corrupted archive files, so why should we care
about performance for it?
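
A minimal sketch of the "skip until found, complain only at EOF" behavior, written in Python for brevity. The block layout (an id and a length, followed by the payload) and the names HEADER, write_block, read_exact and find_block are assumptions made for illustration; this is not the real custom-archive format or the pg_restore code.

```python
import io
import struct

# Hypothetical block header: (entry id, payload length), both 32-bit unsigned.
HEADER = struct.Struct("<II")


def write_block(out, entry_id, payload):
    """Append one block to the archive stream."""
    out.write(HEADER.pack(entry_id, len(payload)))
    out.write(payload)


def read_exact(stream, n):
    """Read up to n bytes, returning fewer only if EOF is reached first."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def find_block(stream, wanted_id):
    """Read forward, discarding blocks, until wanted_id turns up.

    Meeting some other id first is not an error; the only failure
    is hitting EOF without having found the wanted entry.
    """
    while True:
        header = read_exact(stream, HEADER.size)
        if len(header) < HEADER.size:
            raise EOFError(f"entry {wanted_id} not found before end of archive")
        entry_id, length = HEADER.unpack(header)
        payload = read_exact(stream, length)
        if len(payload) < length:
            raise EOFError(f"archive truncated inside entry {entry_id}")
        if entry_id == wanted_id:
            return payload
        # Otherwise skip this block and keep scanning.


archive = io.BytesIO()
write_block(archive, 1, b"a")
write_block(archive, 2, b"bb")
write_block(archive, 3, b"ccc")
archive.seek(0)
found = find_block(archive, 3)
```

Because find_block only ever reads forward, it behaves the same on a pipe as on a regular file. The cost is that a missing entry is reported only after the rest of the input has been read.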

> For now, yes. But in 9.1 we should write out a second TOC and teach
> pg_restore to look for it.

I don't think this is useful.

>> 4. Is there any value in back-porting the Windows FSEEKO support into
>> 8.3 and 8.2? Arguably, not writing the data offsets is a performance
>> bug. However a back-port won't do anything for people who are dumping
>> with less than the latest minor release of pg_dump, so doing this might
>> be largely wasted effort.

> I doubt it's worth it, but I could be persuaded otherwise.

I'm leaning in that direction too. Anybody who's doing a version
upgrade really ought to be using the newer pg_dump version anyway ...

regards, tom lane
