Re: Feature: give pg_dump a WHERE clause expression

From: daveg <daveg(at)sonic(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Davy Durham <pubaddr5(at)davyandbeth(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Feature: give pg_dump a WHERE clause expression
Date: 2008-06-02 06:53:38
Message-ID: 20080602065338.GY2470@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

On Sun, Jun 01, 2008 at 08:50:13PM -0400, Stephen Frost wrote:
> * daveg (daveg(at)sonic(dot)net) wrote:
> > The feature that the proposed patch enables is to create pg_dump custom
> > format archives for multiple tables with a predicate. No amount of csv or
> > xml will do that. Contrived example:
>
> Uh, pg_dump's custom format really isn't particularly special, to be
> honest. Is there some reason you're interested in using it over, as was
> suggested, COPY/CSV/etc format? You could, of course, gzip the output
> of COPY (as pg_dump does) if you're concerned about space..

You really underrate the power and utility of the pg_dump, pg_restore
combination. These are very useful tools. I sense that they are sometimes
considered ugly stepchildren, and in need of comprehensive reformation,
but that glorious future should not stand in the way of buttering todays
bread.

Use case: I have a client with 50+ databases on 40+ hosts with more coming
online all the time, they have a couple dozen tables (varies over time) on
each db that need to have summaries of daily activity posted to a central
database. These also need to be backed up for offline archiving. The
archives are sometimes selectively re-loaded to diagnose or verify events
in the past.

Of course something can always be written to do most of this, or Skytools or
even slony might be useful for the posting part, but the proposed pg_dump
feature pretty much handles the whole thing in a very nice way that will
require about 10 lines of shell script and will work unchanged even as
tables and databases are added and dropped from the set. The operations
staff at the site are familiar with pg_dump and selective restores using
the list options -l/-L etc, so it would very be easy for them to use. It
is also an advantage that the pg_dump archives contain a complete set of
related data in one container that can be loaded or selectively loaded in
one command instead of picking through thousands of csv or copy files.
Also pg_dump also handles issues like permissions and ownership and indexes
and constraints which a csv file does not.

The argument that one could just use copy and tar and gzip applies to the
whole existance of pg_dump itself. pg_dump is just a more convenient way to
copy out tables and as such does not NEED to exist at all. However it is
convenient to that is does exist and does have convenient features to
selectively handle schemas and tables.

What is being suggested is that pg_dump take advantage of a new feature
added to postgresql (copy with where clause) to provide even more convenience
and power than it already has.

If the patch is intrusive, or ugly, or interferes with future plans, or is
likely to confuse people, then certainly it should not go in. But it seems
trivial in terms of code and complexity and does no other harm.

Merely because one do not have a use case at present does not mean that use
cases don't exist. The patch submitter (Davy) seems to have one, and I (dg)
have at least two separate client sites that could use this immediately.
Oddly, or perhaps predictably, Davy's use case is quite different from mine,
and has no appeal to me at all, but the one feature combined with the
existing abilities of pg_dump covers both.

I'll stop now, this is not that big a deal, but it seems like a
useful idea and worth more consideration than it has received.

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Heikki Linnakangas 2008-06-02 07:52:51 Re: extend VacAttrStats to allow stavalues of different types
Previous Message Davy Durham 2008-06-02 05:38:04 Feature: pg_dump: ability to specify WHERE clause expression for -t/--table option