Re: pg_dump and dependencies and --section ... it's a mess

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: pg_dump and dependencies and --section ... it's a mess
Date: 2012-06-21 22:18:27
Message-ID: 4FE39DB3.1010005@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06/21/2012 02:13 PM, Tom Lane wrote:
> Don't know if everybody on this list has been paying attention to the
> pgsql-bugs thread about bug #6699. The shortest example of the problem
> is
>
> create table t1 (f1 int primary key, f2 text);
> create view v1 as select f1, f2 from t1 group by f1;
>
> The view's query is legal only because of the primary key on f1,
> else the reference to f2 would be considered inadequately grouped.
> So when we create the view we mark it as dependent on that primary key
> (or more accurately, the view's _RETURN rule is so marked). This all
> works fine so far as the backend is concerned: you can't drop the PK
> without dropping or redefining the view. But it gives pg_dump
> indigestion. If you do "pg_dump -Fc | pg_restore -l -v" you see
> (relevant entries only):
>
> 5; 2615 2200 SCHEMA - public postgres PRE_DATA
> 168; 1259 41968 TABLE public t1 postgres PRE_DATA
> ; depends on: 5
> 1921; 2606 41975 CONSTRAINT public t1_pkey postgres POST_DATA
> ; depends on: 168 168
> 169; 1259 41976 VIEW public v1 postgres PRE_DATA
> ; depends on: 1919 5
> 1922; 0 41968 TABLE DATA public t1 postgres DATA
> ; depends on: 168
>
> Now, there are two not-nice things about this. First, the view is shown
> as depending on "object 1919", which is pg_dump's internal DumpId for
> the view's _RETURN rule --- but that's nowhere to be seen in the dump,
> so the fact that it's dependent on the t1_pkey constraint is not
> visible in the dump. This puts parallel pg_restore at risk of doing
> the wrong thing. Second, because view definitions are preferentially
> dumped in the PRE_DATA section, the t1_pkey constraint has been hoisted
> up to come before the table data. That means that in an ordinary serial
> restore, the index will be created before the table's data is loaded,
> which is undesirable for efficiency reasons.
>
> It gets worse though. I've labeled the above archive items with their
> "SECTION" codes (which for some reason pg_restore -l -v doesn't print)
> so that you can see that we've got a POST_DATA item that must come
> before a PRE_DATA item. This wreaks havoc with the entire concept
> of splitting dump files into three sections. When I first realized
> that, I was thinking that we would have to revert the --section flag
> we added to pg_dump/pg_restore in 9.2, for lack of any way to guarantee
> that the items can actually be restored if they are split according
> to sections.
>
> I think that we can fix it though. There are provisions in pg_dump
> already for splitting a view apart from its _RETURN rule --- and if the
> rule comes out as a separate object, it's POST_DATA. So at least in
> principle, we could fix things by dumping this scenario this way:
>
> SCHEMA public PRE_DATA
> TABLE t1 PRE_DATA
> TABLE v1 PRE_DATA (at this point it's a table not a view)
> TABLE DATA t1 DATA
> CONSTRAINT t1_pkey POST_DATA
> RULE for v1 POST_DATA (adding the rule turns v1 into a view)
>
> The problem is how to persuade pg_dump to do that; as the code stands,
> it will only break a view apart from its rule if that's necessary to
> break a circular dependency loop, and there is none in this example.
>
> Another point worth thinking about is that if --section is to be trusted
> at all, we need some way of guaranteeing that the dependency-sorting
> code won't happily create other similar cases. There is nothing in
> there right now that would think anything is wrong with an ordering
> that breaks the section division.
>
> I believe the right fix for both of these issues is to add knowledge of
> the section concept to the topological sort logic, so that an ordering
> that puts POST_DATA before DATA or PRE_DATA after DATA is considered to
> be a dependency-ordering violation. One way to do that is to add dummy
> "fencepost" objects to the sort, representing the start and end of the
> DATA section. However, these objects would need explicit dependency
> links to every other DumpableObject, so that doesn't sound very good
> from a performance standpoint. What I'm going to go look at is whether
> we can mark DumpableObjects with their SECTION codes at creation time
> (rather than adding that information at ArchiveEntry() time) and then
> have the topo sort logic take that marking into account in addition to
> the explicit dependency links.
>
> Assuming we can make that work, how far should it be back-patched?
> The --section issue is new in 9.2, but this would also take care of
> the restore problems for views with constraint dependencies, which
> are allowed as of 9.1, so I'm thinking we'd better put it into 9.1.
>
> The other problem is the bogus dependency IDs that pg_dump emits by
> virtue of not caring whether the dependency links to an object it's
> actually dumped. I posted a patch for that in the pgsql-bugs thread
> but pointed out that it would not work before 9.2 without additional
> surgery. If we fix the view vs constraint issue by adding section
> knowledge to the sort, I think we can maybe get away with not fixing
> the dependency IDs in back branches. parallel pg_restore is definitely
> at risk without the fix, but in the absence of any other known problem
> cases I'm inclined to not change the back branches more than we have to.
>
> Thoughts?
>
>

Wow, talk about subtle. When I first read this my face turned a bit red,
but after reading it a couple more times I don't feel quite so guilty,
especially since the constraint dependency didn't exist at the time we
did parallel pg_restore.

If I'm understanding you correctly, fixing the bogus dependency thing is
more an insurance policy than fixing a case (other than the constraint
dependency) that is known to be broken. If that is correct, and given
that we've had 3 years of parallel pg_restore and not heard of such
issues, it doesn't seem unreasonable not to backpatch.

(There's another bug to do with parallel pg_restore and clustering that
Andrew Hammond raised back in January, that I want to fix when I get
some time.)

cheers

andrew

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2012-06-21 22:23:57 Re: [PATCH 04/16] Add embedded list interface (header only)
Previous Message Simon Riggs 2012-06-21 21:53:57 Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)