Re: pg_dump --split patch

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	dmitry(at)koterov(dot)ru, Joel Jacobson <joel(at)gluefinance(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, David Wilson <david(dot)t(dot)wilson(at)gmail(dot)com>
Subject:	Re: pg_dump --split patch
Date:	2011-01-23 02:04:33
Message-ID:	AANLkTi=5jojXSPsHduPJGiQk4=c4uyVHS2vZ-yk3itGr@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Jan 3, 2011 at 2:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Jan 3, 2011 at 1:34 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Yeah, that's exactly it. I can think of some possible uses for
>>> splitting up pg_dump output, but frankly "to ease diff-ing" is not
>>> one of them. For that problem, it's nothing but a crude kluge that
>>> only sort-of helps. If we're to get anywhere on this, we need a
>>> better-defined problem statement that everyone can agree is worth
>>> solving and is well solved with this particular approach.
>
>> I have to admit I'm a bit unsold on the approach as well. It seems
>> like you could write a short Perl script which would transform a text
>> format dump into the proposed format pretty easily, and if you did
>> that and published the script, then the next poor shmuck who had the
>> same problem could either use the script as-is or hack it up to meet
>> some slightly different set of requirements. Or maybe you'd be better
>> off basing such a script on the custom or tar format instead, in order
>> to avoid the problem of misidentifying a line beginning with --- as a
>> comment when it's really part of a data item. Or maybe even writing a
>> whole "schema diff" tool that would take two custom-format dumps as
>> inputs.
>
>> On the other hand, I can certainly think of times when even a pretty
>> dumb implementation of this would have saved me some time.
>
> The basic objection that I have to this patch is that it proposes to
> institutionalize a pretty dumb implementation. And, as you mentioned,
> once it's in there it'll be more or less set in stone because we aren't
> going to want to support umpteen variants.
>
> I like the idea of a postprocessing script a lot better --- it seems
> like it wouldn't get in the way of people making their own variants.
> And as you say it'd likely be pretty trivial to do.

I notice that this patch is marked as "Needs Review" in the CommitFest
application, but I think it's fair to say that there's no consensus to
commit something along these lines. Accordingly, I'm going to mark it
"Returned with Feedback". There is clearly a need for better tooling
in this area, but I think there's a great deal of legitimate doubt
about whether this is the right solution to that problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
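
A rough sketch of the kind of postprocessing script discussed in the quoted text, written in Python rather than Perl. It assumes the "-- Name: ...; Type: ...; Schema: ..." header comments that pg_dump writes before each object in plain-text output. The function name split_dump and the PREAMBLE key are made up for illustration and are not part of the patch or of pg_dump. As the thread points out, a data line that happens to look like such a header would be misclassified, which is the argument for working from the custom or tar format instead.

```python
import re
from collections import defaultdict

# Header comment assumed to precede each object in a plain-text dump, e.g.
#   -- Name: users; Type: TABLE; Schema: public; Owner: alice
#   -- Data for Name: users; Type: TABLE DATA; Schema: public; Owner: alice
HEADER = re.compile(
    r"^-- (?:Data for )?Name: (?P<name>.+?); "
    r"Type: (?P<type>.+?); "
    r"Schema: (?P<schema>.+?)(?:;|$)"
)

# Hypothetical key for everything that appears before the first header.
PREAMBLE = ("-", "PREAMBLE", "-")


def split_dump(text):
    """Group the lines of a plain-text dump by (schema, type, name).

    Every line is kept; a line is only used to switch chunks when it
    matches HEADER. The bare "--" line that precedes a header therefore
    stays with the previous chunk.
    """
    chunks = defaultdict(list)
    key = PREAMBLE
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            key = (match.group("schema"), match.group("type"), match.group("name"))
        chunks[key].append(line)
    return {k: "\n".join(lines) + "\n" for k, lines in chunks.items()}
```

Each returned chunk could then be written to its own file (for example one directory per schema and object type), which gives roughly the layout the patch proposes without changing pg_dump itself.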

In response to

Re: pg_dump --split patch at 2011-01-03 19:18:23 from Tom Lane
