Re: pg_dump in 7.4

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
Cc: "Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump in 7.4
Date: 2002-11-13 13:33:43
Message-ID: 20875.1037194423@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au> writes:
> Has anyone given much thought to improving pg_dump's object order algorithm
> for 7.4? It seems that now we have dependencies, it should just be a matter
> of doing a breadth-first or depth-first search over the pg_depend table to
> generate a valid order of oids.

I've thought about this a little. Using the pg_depend data would
obviously give pg_dump a big leg up on the problem, but it's by no means
a trivial task even so. Some things to think about:

* We don't store dependencies for SQL functions to things mentioned in
the SQL function body. (Maybe we should, but we don't.) So there's
data missing in that case, and possibly other cases.

* pg_dump really ought to still work for pre-7.3 databases. What will
its fallback strategy be like when it hasn't got pg_depend?

* With ALTER TABLE, CREATE OR REPLACE FUNCTION, etc, it's not hard at
all to create situations with circular dependencies. Right now pg_dump
is guaranteed to produce an unusable dump in such cases, but our
ambition should be to make it work. That means pg_dump needs a strategy
for breaking circular dependency paths, by itself using ALTER when
needed to postpone a reference.

The thought that I'd been toying with is to build a list of inter-object
dependencies (using pg_depend if available, else fall back on pg_dump's
native wit, ie, the rather limited set of dependencies it already
understands). Then do a topological sort, preferring to maintain OID
order in cases where the object ordering is underspecified. When the
sort fails (ie, there's a circular dependency) then modify the set of
dumpable objects by breaking some object into two parts (a base
declaration and an ALTER command); this changes the dependencies too.
Repeat the sort and adjustment steps until the sort succeeds.

If you're not familiar with topological sorts, look at Knuth or your
favorite algorithms text. There are one or two instances of the method
in PG already (deadlock detection, for example).

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2002-11-13 13:36:20 Re: pg_dump in 7.4
Previous Message Zeugswetter Andreas SB SD 2002-11-13 13:23:11 Re: RC1?