Parallel pg_restore versus dependencies

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Parallel pg_restore versus dependencies
Date: 2010-08-20 18:15:10
Message-ID: 23787.1282328110@sss.pgh.pa.us
Lists: pgsql-hackers

I've been poking into bug #5626,
http://archives.postgresql.org/pgsql-bugs/2010-08/msg00291.php

What's basically going on here is:
1. User tried to suppress the public schema from the restore list.
2. Since almost everything in the dump depends on the public schema,
pg_restore skipped over most of it looking for something it could
restore. It soon hit a TABLE DATA item from another schema, triggering
the switch into actual parallel restore mode.
3. Eventually, it found the public schema, which SortTocFromFile had
pushed to the end of the TOC list. At that point it recognized that
it shouldn't actually emit the item, so it didn't, but it did mark
the dependencies satisfied.
4. Now the floodgates are open to try to restore all the DDL items
in the public schema. But we're trying to do it in parallel.
Because pg_dump is exceedingly cavalier about marking DDL items with
their full dependencies, things soon go pear-shaped: in the reported
bug, we tried to do two interdependent DDL ops in parallel, and when
trying to duplicate the bug using the regression database, I
consistently got failures from restoring a view that depended on
not-yet-restored functions.
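
To make the sequence above concrete, here is a toy model in Python. It is
not pg_restore's actual C code: the TocEntry class, the next_ready,
mark_done and serial_phase helpers, and the sample entries are all made up
for illustration. It only assumes what is described above, namely that the
serial loop picks the first entry whose recorded dependencies are satisfied
and stops as soon as that entry is a DATA or POST_DATA item.

```python
from dataclasses import dataclass, field


@dataclass
class TocEntry:
    dump_id: int
    desc: str
    section: str                      # "PRE_DATA", "DATA" or "POST_DATA"
    deps: set = field(default_factory=set)
    wanted: bool = True               # False = left out of the -L list
    done: bool = False


def next_ready(toc):
    """Return the first unprocessed entry with no pending dependencies."""
    for te in toc:
        if not te.done and not te.deps:
            return te
    return None


def mark_done(toc, te):
    """Mark te processed and drop it from everyone's pending dependencies."""
    te.done = True
    for other in toc:
        other.deps.discard(te.dump_id)


def serial_phase(toc):
    """Run the (modelled) serial loop; stop at the first ready DATA item."""
    restored = []
    while (te := next_ready(toc)) is not None:
        if te.section in ("DATA", "POST_DATA"):
            break                     # switch to parallel mode here
        if te.wanted:
            restored.append(te.desc)
        mark_done(toc, te)
    left = [t.desc for t in toc if not t.done and t.section == "PRE_DATA"]
    return restored, left


def example_toc():
    """Toy TOC in re-ordered form: the skipped public schema is at the tail."""
    return [
        TocEntry(2, "TABLE public.t1", "PRE_DATA", {1}),
        TocEntry(3, "FUNCTION public.f", "PRE_DATA", {1}),
        # The view really needs the function too, but that is not recorded.
        TocEntry(4, "VIEW public.v", "PRE_DATA", {1}),
        TocEntry(5, "SCHEMA other", "PRE_DATA"),
        TocEntry(6, "TABLE other.t2", "PRE_DATA", {5}),
        TocEntry(7, "TABLE DATA other.t2", "DATA", {6}),
        TocEntry(8, "TABLE DATA public.t1", "DATA", {2}),
        TocEntry(1, "SCHEMA public", "PRE_DATA", wanted=False),
    ]
```

In this model, serial_phase(example_toc()) restores only SCHEMA other and
TABLE other.t2 before the switch. The public table, function and view are
all left for the parallel phase, where nothing records that the view needs
the function.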

It'd probably be nice if the dependency data were more complete for
DDL items, but getting that right is a long-term project, and in
any case pg_restore can't really rely on it to be there in existing
dump files. Right now that data can only really be trusted for
SECTION_DATA and SECTION_POST_DATA items, and we have to rely on
the dump ordering for SECTION_PRE_DATA items.

I think we can patch it up for now by doing two things:

* Tweak SortTocFromFile so that items not-to-be-restored end up at the
*head* of the re-ordered TOC list, not the tail. This won't actually
make any difference in net runtime, but what it will do is ensure that
we scan those items and mark their dependencies as satisfied before
anything starts to happen for real. Thus omitting an item won't
result in unexpected departures from the commanded restore order.

* In restore_toc_entries_parallel, don't exit the serial restore mode
and start parallel restoring until we reach a TOC item that is both
DATA/POST_DATA *and* marked to be restored. This will prevent any
not-to-be-restored DATA/POST_DATA items at the list head from triggering
a premature switch into parallel restore mode.
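
The two changes can be sketched together in a small Python model. This is
an illustration under stated assumptions, not the real implementation:
sort_toc_from_list and serial_phase below are hypothetical stand-ins for
SortTocFromFile and the serial loop in restore_toc_entries_parallel, and
the TocEntry class and sample data are invented.

```python
from dataclasses import dataclass, field


@dataclass
class TocEntry:
    dump_id: int
    desc: str
    section: str                      # "PRE_DATA", "DATA" or "POST_DATA"
    deps: set = field(default_factory=set)
    wanted: bool = False              # set from the -L list
    done: bool = False


def sort_toc_from_list(toc, wanted_ids):
    """Skipped entries go to the head (dump order), listed ones follow."""
    by_id = {te.dump_id: te for te in toc}
    for te in toc:
        te.wanted = te.dump_id in wanted_ids
    skipped = [te for te in toc if not te.wanted]
    listed = [by_id[i] for i in wanted_ids]
    return skipped + listed


def serial_phase(toc):
    """Stay serial until a ready entry is DATA/POST_DATA *and* wanted."""
    restored = []
    while True:
        te = next((t for t in toc if not t.done and not t.deps), None)
        if te is None:
            break
        if te.wanted and te.section in ("DATA", "POST_DATA"):
            break                     # only now switch to parallel mode
        if te.wanted:
            restored.append(te.desc)
        te.done = True                # skipped entries just satisfy deps
        for other in toc:
            other.deps.discard(te.dump_id)
    return restored


def example_toc():
    """Toy TOC in dump order; entry 7 has no recorded deps on purpose."""
    return [
        TocEntry(1, "SCHEMA public", "PRE_DATA"),
        TocEntry(2, "TABLE public.t1", "PRE_DATA", {1}),
        TocEntry(3, "FUNCTION public.f", "PRE_DATA", {1}),
        TocEntry(4, "VIEW public.v", "PRE_DATA", {1}),
        TocEntry(5, "SCHEMA other", "PRE_DATA"),
        TocEntry(6, "TABLE other.t2", "PRE_DATA", {5}),
        TocEntry(7, "TABLE DATA other.t2", "DATA"),
        TocEntry(8, "TABLE DATA public.t1", "DATA", {2}),
    ]


# -L list that omits the public schema (1) and one data item (7).
ordered = sort_toc_from_list(example_toc(), [2, 3, 4, 5, 6, 8])
order_ids = [te.dump_id for te in ordered]
serially_restored = serial_phase(ordered)
```

Here the two skipped entries are scanned first, so their dependents become
ready immediately, and the skipped DATA entry at the head does not end the
serial phase. All five PRE_DATA items in the list are restored serially, in
the listed order, before the first wanted DATA item is reached.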

It will still be the case that you can break it with an unwise choice of
restore order from a -L file, but at least it won't fail because of
hidden implementation behaviors.

In HEAD and perhaps 9.0, we could make things more robust by only
putting DATA/POST_DATA items into the parallel-restore lists in the
first place, and forcing all PRE_DATA items to be done in the initial
serial restore loop. However, this would amount to ignoring the
commanded -L order to a greater extent than strictly necessary.
I'm not entirely sure if that's a good idea or not. Should we try to
honor the -L order even when it's not very safe?
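
As a rough picture of that trade-off, the following Python sketch splits a
listed order into a serial part and a parallel part. The function name
split_for_restore and the sample list are hypothetical, and the model
ignores dependencies entirely; it only shows how a PRE_DATA item listed
late in the -L file would be pulled ahead of the DATA items.

```python
def split_for_restore(listed):
    """Split (description, section) pairs, given in -L order, into the
    items restored serially and the items handed to parallel workers."""
    serial = [d for d, s in listed if s == "PRE_DATA"]
    parallel = [d for d, s in listed if s in ("DATA", "POST_DATA")]
    return serial, parallel


# Toy -L order in which the user placed a table after another table's data.
listed = [
    ("TABLE a", "PRE_DATA"),
    ("TABLE DATA a", "DATA"),
    ("TABLE b", "PRE_DATA"),
    ("TABLE DATA b", "DATA"),
    ("INDEX a_idx", "POST_DATA"),
]
serial, parallel = split_for_restore(listed)
```

With this input, TABLE b is restored before TABLE DATA a even though the
list asked for the opposite, which is the sense in which the -L order
would be overridden.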

Comments?

regards, tom lane
