Re: pg_dump test instability

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump test instability
Date: 2018-08-28 18:47:17
Message-ID: 1765.1535482037@sss.pgh.pa.us
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> Parallel *restore* can be done from either a custom-format dump or from
> a directory-format dump. I agree that we should separate the concerns
> and perform independent sorting on the restore side of things based on
> the relative sizes of tables in the dump (be it custom format or
> directory format). While compression might make us not exactly correct
> on the restore side, I expect that we'll generally be close enough to
> avoid most cases where a single worker gets stuck working on a large
> table at the end after all the other work is done.

Here's a proposed patch for this. It removes the hacking of the TOC list
order, solving Peter's original problem, and instead sorts by size
in the actual parallel dump or restore control code (a rough sketch of
the idea appears after the list below). There are a number of ensuing
performance benefits:

* The BLOBS entry, if any, gets to participate in the ordering decision
during parallel dumps. As the code stands, all the effort to avoid
scheduling a long job last is utterly wasted if you've got a lot of
blobs, because that entry stayed at the end. I didn't work really hard
on that; I just gave it a large size so it would go first, not last. If
you have just a few blobs, that's not necessary, but I doubt it hurts
either.

* During restore, we insert actual size numbers into the BLOBS and
TABLE DATA items, and then anything that depends on a TABLE DATA item
inherits its size. This results in size-based prioritization not just
for simple indexes as before, but also for constraint indexes (UNIQUE
or PRIMARY KEY), foreign key verifications, delayed CHECK constraints,
etc. It also means that stuff like triggers and rules gets reinstalled
in size-based order, which doesn't really help, but again I don't
think it hurts.

* Parallel restore scheduling by size works for custom dumps as well
as directory ones (as long as the dump file was seekable when created,
but you'll be hurting anyway if it wasn't).
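For concreteness, here's a rough sketch of the scheduling idea (this is
not the patch itself; SketchTocEntry and the sizes shown are invented
stand-ins for the real TocEntry data). The ready items get sorted by
their size estimate so the biggest jobs are dispatched first, and the
BLOBS entry just gets a large sentinel size:

    /*
     * Sketch only: "SketchTocEntry" stands in for pg_dump's TocEntry,
     * and the sizes are invented.  The point is just the descending-size
     * sort and the large sentinel size for the BLOBS entry.
     */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct SketchTocEntry
    {
        const char *desc;       /* e.g. "TABLE DATA" or "BLOBS" */
        long        dataLength; /* size estimate taken from the dump */
    } SketchTocEntry;

    /* qsort comparator: bigger jobs sort first */
    static int
    size_compare(const void *a, const void *b)
    {
        const SketchTocEntry *ta = *(SketchTocEntry *const *) a;
        const SketchTocEntry *tb = *(SketchTocEntry *const *) b;

        if (ta->dataLength != tb->dataLength)
            return (ta->dataLength < tb->dataLength) ? 1 : -1;
        return 0;
    }

    int
    main(void)
    {
        /* give BLOBS a large sentinel size so it is scheduled first */
        SketchTocEntry blobs = {"BLOBS", 1L << 30};
        SketchTocEntry t1 = {"TABLE DATA small_table", 500};
        SketchTocEntry t2 = {"TABLE DATA big_table", 90000};
        SketchTocEntry *ready[] = {&t1, &blobs, &t2};

        qsort(ready, 3, sizeof(SketchTocEntry *), size_compare);

        for (int i = 0; i < 3; i++)
            printf("%s (%ld)\n", ready[i]->desc, ready[i]->dataLength);
        return 0;
    }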

I have not really tried to demonstrate performance benefits, because
the results would depend a whole lot on your test case; but at least
in principle this should result in far more intelligent scheduling
of parallel restores.

While I haven't done so here, I'm rather tempted to rename the
par_prev/par_next fields and par_list_xxx functions to pending_prev,
pending_next, pending_list_xxx, since they now have only one use.
(BTW, I tried really hard to get rid of par_prev/par_next altogether,
in favor of keeping the pending entries in the unused space in the
"ready" TocEntry* array. But it didn't work out well --- seems like
a list really is the natural data structure for that.)
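The reason a list is so natural here is that entries leave the pending
set in arbitrary order as their dependencies get satisfied, and an
intrusive doubly-linked list gives O(1) unlinking with no repacking,
which an array-based scheme can't match. A minimal sketch of that shape
(using the pending_* spelling suggested above; PendingEntry is a made-up
stand-in for TocEntry, with a dummy header node representing the list):

    #include <stddef.h>

    typedef struct PendingEntry
    {
        struct PendingEntry *pending_prev;
        struct PendingEntry *pending_next;
    } PendingEntry;

    /* an empty list is a header node linked to itself */
    static void
    pending_list_init(PendingEntry *head)
    {
        head->pending_prev = head->pending_next = head;
    }

    /* add a new pending item at the tail */
    static void
    pending_list_append(PendingEntry *head, PendingEntry *te)
    {
        te->pending_prev = head->pending_prev;
        te->pending_next = head;
        head->pending_prev->pending_next = te;
        head->pending_prev = te;
    }

    /* O(1) unlink when an entry's dependencies are satisfied */
    static void
    pending_list_remove(PendingEntry *te)
    {
        te->pending_prev->pending_next = te->pending_next;
        te->pending_next->pending_prev = te->pending_prev;
        te->pending_prev = te->pending_next = NULL;
    }

    int
    main(void)
    {
        PendingEntry head, a, b;

        pending_list_init(&head);
        pending_list_append(&head, &a);
        pending_list_append(&head, &b);
        pending_list_remove(&a);  /* a's dependencies are now satisfied */
        return 0;
    }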

regards, tom lane

Attachment Content-Type Size
smarter-parallel-dump-restore-1.patch text/x-diff 43.7 KB
