Re: pg_dump -s dumps data?!

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: depesz(at)depesz(dot)com
Cc: Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>, Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump -s dumps data?!
Date: 2012-01-31 04:18:31
Message-ID: 24757.1327983511@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

[ Note: please follow-up to pgsql-hackers not pgsql-general; I think
this discussion needs to move there ]

hubert depesz lubaczewski <depesz(at)depesz(dot)com> writes:
> On Mon, Jan 30, 2012 at 11:30:51AM -0500, Tom Lane wrote:
>> That is way too vague for my taste, as you have not shown the pg_dump
>> options you're using, for example.

> i tried to explain that the options don't matter, but here we go. full
> example:
> [ example showing pg_dump's odd behavior for extension config tables ]

[ traces through that with gdb... ]

As I suspected, the behavioral change from 9.1 to HEAD is not
intentional. It is an artifact of commit
7b070e896ca835318c90b02c830a5c4844413b64, which is almost, but not
quite, entirely broken. I won't enumerate its shortcomings here,
because they're not really relevant, but it does seem appropriate to
discuss exactly what we think *should* happen for tables created inside
extensions.

The original design intention for non-config tables was, per the manual:

Ordinarily, if a table is part of an extension, neither the
table's definition nor its content will be dumped by pg_dump.

the assumption being that both the definition and the content would be
re-loaded by executing the extension's SQL script. The purpose of
pg_extension_config_dump() is to allow part or all of the data in the
table to be treated as user data and thus dumped; this is assumed to
be data not supplied by the extension script but by subsequent user
insertions.

I don't recall that we thought very hard about what should happen when
pg_dump switches are used to produce a selective dump, but ISTM
reasonable that if it's "user data" then it should be dumped only if
data in a regular user table would be. So I agree it's pretty broken
that "pg_dump -t foo" will dump data belonging to a config table not
selected by the -t switch. I think this should be changed in both HEAD
and 9.1 (note that HEAD will presumably return to the 9.1 behavior once
that --exclude-table-data patch gets fixed).

What's not apparent to me is whether there's an argument for doing more
than that. It strikes me that the current design is not very friendly
towards the idea of an extension that creates a table that's meant
solely to hold user data --- you'd have to mark it as "config" which
seems a bit unfortunate terminology for that case. Is it important to
do something about that, and if so what?

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2012-01-31 04:29:44 Re: [GENERAL] Why extract( ... from timestamp ) is not immutable?
Previous Message Nick 2012-01-31 03:55:33 Help speeding up a left join aggregate

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-01-31 04:29:44 Re: [GENERAL] Why extract( ... from timestamp ) is not immutable?
Previous Message Ashutosh Bapat 2012-01-31 03:56:47 Re: Confusing EXPLAIN output in case of inherited tables