Quick Links

Re: Warn when parallel restoring a custom dump without data offsets

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	David Gilman <dgilman(at)gilslotd(dot)com>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org, pryzby(at)telsasoft(dot)com
Subject:	Re: Warn when parallel restoring a custom dump without data offsets
Date:	2020-07-12 19:57:58
Message-ID:	2571310.1594583878@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

David Gilman <dgilman(at)gilslotd(dot)com> writes:
> On Thu, Jul 02, 2020 at 05:25:21PM -0400, Tom Lane wrote:
>> I guess I'm completely confused about the purpose of these patches.
>> Far from coping with the situation of an unseekable file, they appear
>> to change pg_restore so that it fails altogether if it can't seek
>> its input file. Why would we want to do this?

> I'm not sure where the "fails altogether if it can't seek" is.

I misread the patch, is where :-(

As penance, I spent some time studying this patchset, and have a few
comments:

1. The proposed doc change in 0001 seems out-of-date; isn't it adding a
warning about exactly the deficiency that the rest of the patch is
eliminating? Note that the preceding para already says that the input
has to be seekable, so that's covered. Maybe there is reason for
documenting that parallel restore will be slower if the archive was
written in a non-seekable way ... but that's not what this says.

2. It struck me that the patch is still pretty inefficient, in that
anytime it has to back up in an offset-less archive, it blindly rewinds
to dataStart and rescans everything. In the worst case that'd still be
O(N^2) work, and it's really not necessary, because once we've seen a
given data block we know where it is. We just have to remember that,
which seems easy enough. (Well, on Windows it's a bit trickier because
the state in question is shared across threads; but that's good, it might
save some work.)

3. Extending on #2, we actually don't need the rewind and retry logic
at all. If we are looking for a block we haven't already seen, and we
get to the end of the archive, it ain't there. (This is a bit less
obvious in the Windows case than otherwise, but I think it's still true,
given that the start state is either "all offsets known" or "no offsets
known". A particular thread might skip over some blocks on the strength
of an offset established by another thread, but the blocks ahead of that
spot must now all have known offsets.)

4. Patch 0002 seems mighty expensive for the amount of code coverage
it's adding. On my machine it seems to raise the overall runtime of
pg_dump's "make installcheck" by about 10%, and the only new coverage
is of the few lines added here. I wonder if we couldn't cover that
more cheaply by testing what happens when we use a "-L" option with
an intentionally mis-sorted restore list.

5. I'm inclined to reject 0003. It's not saving anything very meaningful,
and we'd just have to put the code back whenever somebody gets around
to making pg_backup_tar.c capable of out-of-order restores like
pg_backup_custom.c is now able to do.

The attached 0001 rewrites your 0001 as per the above ideas (dropping
the proposed doc change for now), and includes your 0004 for simplicity.
I'm including your 0002 verbatim just so the cfbot will be able to do a
meaningful test on 0001; but as stated, I don't really want to commit it.

regards, tom lane

Attachment	Content-Type	Size
0001-remember-seek-positions.patch	text/x-diff	4.6 KB
0002-Add-integration-test-for-out-of-order-TOC-requests-i.patch	text/x-diff	6.1 KB

In response to

Re: Warn when parallel restoring a custom dump without data offsets at 2020-07-08 03:19:35 from David Gilman

Responses

Re: Warn when parallel restoring a custom dump without data offsets at 2020-07-15 20:40:59 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Daniel Gustafsson	2020-07-12 20:04:21	Re: should INSERT SELECT use a BulkInsertState?
Previous Message	Daniel Gustafsson	2020-07-12 19:52:07	Re: explain HashAggregate to report bucket and memory stats