Re: pg_dump.c

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump.c
Date: 2011-09-11 18:50:06
Message-ID: 10792.1315767006@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> One example of what I'd like to provide is something like this:

> char *pg_get_create_sql(PGconn *conn, Oid object, Oid catalog_class,
> bool pretty);

> Which would give you the sql to create an object, optionally pretty
> printing it.

I think the major problem with creating a decent API here is that
"the SQL to create an object" is only a small part ... almost a trivial
part ... of what pg_dump needs to know about it. It's also aware of
ownership, permissions, schema membership, dependencies, and so on.
I'm not sure about a reasonable representation for all that.

In particular, I think that discovering a safe dump order for a selected
set of objects is a pretty key portion of pg_dump's functionality.
Do we really want to assume that that needn't be included in a
hypothetical library?
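
To make the dump-order point concrete: finding a safe order amounts to a topological sort over the dependency graph of the selected objects. The following is only a minimal sketch of that idea, not pg_dump's actual code. The DepEdge struct, the safe_dump_order name, the MAX_OBJECTS limit, and the use of small integers in place of real catalog OIDs are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 64          /* illustrative limit, not a pg_dump constant */

/* Hypothetical edge: `dependent` cannot be created before `referenced`. */
typedef struct
{
    int         dependent;
    int         referenced;
} DepEdge;

/*
 * Kahn-style topological sort over objects numbered 0..n-1.
 * Fills order[0..n-1] with a creation order that respects every edge and
 * returns true, or returns false if the dependencies form a cycle.
 */
bool
safe_dump_order(int n, const DepEdge *edges, size_t n_edges, int *order)
{
    int         pending[MAX_OBJECTS] = {0};     /* unmet dependencies */
    bool        emitted[MAX_OBJECTS] = {false};
    int         count = 0;

    assert(n >= 0 && n <= MAX_OBJECTS);

    for (size_t i = 0; i < n_edges; i++)
        pending[edges[i].dependent]++;

    while (count < n)
    {
        int         next = -1;

        /* pick the lowest-numbered object with nothing left to wait for */
        for (int obj = 0; obj < n; obj++)
        {
            if (!emitted[obj] && pending[obj] == 0)
            {
                next = obj;
                break;
            }
        }
        if (next < 0)
            return false;       /* every remaining object is in a cycle */

        emitted[next] = true;
        order[count++] = next;

        /* objects that were waiting on `next` have one less dependency */
        for (size_t i = 0; i < n_edges; i++)
        {
            if (edges[i].referenced == next)
                pending[edges[i].dependent]--;
        }
    }
    return true;
}
```

A library that only returns the SQL for one object at a time leaves this step, including the handling of dependency cycles, to every caller.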

Other issues include:

* pg_dump's habit of assuming that the SQL is being generated to work
with a current server as target, even when dumping from a much older
server. It's not clear to me that other clients for a library would
want that behavior ... but catering to multiple output versions would
kick the complexity up by an order of magnitude.

* a lot of other peculiar things that pg_dump does in the name of
backwards compatibility or robustness of the output script, which again
aren't necessarily useful for other purposes. An example here is the
choice to treat the tablespace of a table as a separate property that's
not specified in the base CREATE TABLE command, so that the script
doesn't fail completely if the target database hasn't got such a
tablespace.

* performance. Getting the data retail per-object, as the above API
implies, would utterly suck. You have to think a little more carefully
about the integration between the discovery phase and the output phase;
there has to be a good deal of it.
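
The tablespace example in the second item above can be sketched as follows. This is a hedged illustration, not code from pg_dump: emit_table is a hypothetical helper, and it assumes the tablespace is selected through a separate SET default_tablespace statement ahead of a CREATE TABLE that carries no TABLESPACE clause. Identifier quoting is assumed to have been done by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the script fragment for one table into buf.
 * create_stmt is the base CREATE TABLE command without a TABLESPACE clause;
 * tablespace is an already-quoted name, or NULL/"" for the default.
 * Returns true if the whole fragment fit into buf.
 */
bool
emit_table(char *buf, size_t buflen,
           const char *create_stmt, const char *tablespace)
{
    int         n;

    assert(buf != NULL && create_stmt != NULL);

    if (tablespace == NULL || strlen(tablespace) == 0)
        n = snprintf(buf, buflen,
                     "SET default_tablespace = '';\n%s;\n", create_stmt);
    else
        n = snprintf(buf, buflen,
                     "SET default_tablespace = %s;\n%s;\n",
                     tablespace, create_stmt);

    /* snprintf reports the length it wanted; detect truncation */
    return n >= 0 && (size_t) n < buflen;
}
```

Because the tablespace lives in its own statement, a missing tablespace on the target makes only the SET fail; the CREATE TABLE that follows is still executed. A caller that just wants "the SQL to create this table" might well prefer a single statement with the TABLESPACE clause inline.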

regards, tom lane
