Re: pg_dump --split patch

From: Joel Jacobson <joel(at)gluefinance(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, David Wilson <david(dot)t(dot)wilson(at)gmail(dot)com>, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump --split patch
Date: 2010-12-29 07:27:34
Message-ID: AANLkTikRn63rn2LiVJGeFXKZ9myWMiPJj5jjZP7A2=Lg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2010/12/29 Andrew Dunstan <andrew(at)dunslane(dot)net>

> try:
>
> diff -F '^CREATE' ...
>
> cheers
>
> andrew
>

Embarrasing, I'm sure I've done `man diff` before, must have missed that
one, wish I'd known about that feature before, would have saved me many
hours! :-) Thanks for the tip!

There are some other real-life use-cases where I think splitting would be
nice and save a lot of time:

a) if you don't have a perfect 1:1 relationship between all the SPs in your
database and your source code repository (handled by your favorite version
control system), i.e. if you suspect some SPs in the database might differ
compared to the source code files in your repo. In this scenario, it might
be simpler to "start over" and continue developing on a repo built from a
pg_dump --split export. You would lose all history, but it might still be
worth it if the "compare everything in database against source code files in
repo"-project would take a lot of man hours.

b) quick branching - perhaps you are a consultant at a company where they
don't even have the SPs stored in separate files, they might have been
magically installed by some consultant before you without any trace. :-) To
get up to speed solving the problem you've been assigned, which in this
example involves a lot of SP coding and modifications of existing functions,
it would save a lot of time if you had all functions in separate files
before you started coding, then you would use git or any other nice version
control system to track your changes and figure out what you've done once
you get everything to work.

c) automatically saving daily snapshots of your production database schema
to your version control system. While the best version control system (git)
does not track individual files, many of the ancient ones still very popular
ones like svn do so. If every function in the production database schema
would be saved automatically to the VCS, you would be guaranteed to have a
tack of all deployed changes affecting each function, which is probably a
lot fewer changes compared to the entire history for each function, assuming
developers commit things while developing and not only when deploying.

d) while pg_dump offers some options to limit the output content, such as -s
for "schema only" and -t/-T to limit which tables to dump, it lacks options
to export "functions only" or "these functions only". It would require quite
a lot of such options to provide the same flexibility as a split dump,
highly reducing the need for such options as you could then compose your own
restore script based on the dump.

Of course, not all of these scenarios are relevant for everybody.

--
Best regards,

Joel Jacobson
Glue Finance

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Kirkwood 2010-12-29 08:16:29 Re: "writable CTEs"
Previous Message Robert Haas 2010-12-29 04:54:46 Re: and it's not a bunny rabbit, either