Re: customizing pg_dump together with copy.c's DoCopy function

From: "Brian Mathis" <brian(dot)mathis(at)gmail(dot)com>
To: "lynnsettle(at)yahoo(dot)com" <lynnsettle(at)yahoo(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: customizing pg_dump together with copy.c's DoCopy function
Date: 2006-07-16 20:17:13
Message-ID: 183c528b0607161317x201547b4v8981f76ab3ac8c3f@mail.gmail.com
Lists: pgsql-general

pg_dump dumps to STDOUT by default, which you can use in a pipeline to
perform any modifications. To me this seems pretty tricky, but it should be
doable; modifying pg_dump itself really strikes me as the wrong way to go
about it. Pipelines operate in memory and should be very fast, depending on
how you write the filtering program. You would need to dump the data without
compression, then compress it coming out the other end (and maybe split it
up too). Something like this:
pg_dump | myfilter | gzip | split --bytes=2000M - mydump.
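
If it helps, here is a rough, untested sketch of what a "myfilter" stage
could look like, assuming a plain-text (uncompressed) pg_dump stream. The
table names and the rename map are invented purely for illustration:

#!/usr/bin/env python
# Hypothetical "myfilter": read a plain-text pg_dump stream on stdin,
# rewrite it, and write the result to stdout.
import sys

RENAMES = {'old_table': 'new_table'}    # made-up old -> new table names

in_copy = False
for line in sys.stdin:
    if not in_copy and line.startswith('COPY '):
        # Header looks like: COPY tablename (col1, col2, ...) FROM stdin;
        parts = line.split(' ', 2)
        parts[1] = RENAMES.get(parts[1], parts[1])
        line = ' '.join(parts)
        in_copy = True
    elif in_copy and line.rstrip('\n') == '\\.':
        in_copy = False                 # end-of-data marker for the COPY block
    elif in_copy:
        pass                            # data rows could be transformed here
    sys.stdout.write(line)

On the restore side you would just cat the split pieces back together,
gunzip, and feed the result to psql.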

Also, you can't expect much speed if you have no disk space, and
reading/writing to the same disk will kill you. If you can set up some
temp space over NFS on the local network, that should gain you some speed.
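
For the kind of column remapping and table re-linking Lynn describes in the
quoted message below, the per-row transform inside such a filter might look
roughly like this (COPY data rows are tab-separated, with \N for NULL; the
column positions and id mapping here are purely hypothetical):

OLD_TO_NEW_IDS = {'17': '42'}    # made-up id remapping, built ahead of time

def transform_row(line):
    cols = line.rstrip('\n').split('\t')
    del cols[2]                  # e.g. drop a column the new schema no longer has
    if cols[1] != '\\N':         # leave NULLs (\N) alone
        cols[1] = OLD_TO_NEW_IDS.get(cols[1], cols[1])
    return '\t'.join(cols) + '\n'

Something like that would be called from the sketch above for each data row
inside a COPY block, in place of the pass-through.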

On 11 Jul 2006 08:43:17 -0700, lynnsettle(at)yahoo(dot)com <lynnsettle(at)yahoo(dot)com>
wrote:
>
> > > Is it possible to compile-link together frontend pg_dump code with
> > > backend code from copy.c?
> >
> > No. Why do you think you need to modify pg_dump at all?
> >
>
> pg_dump and pg_restore provide important advantages for upgrading a
> customer's database on site:
>
> They are fast. I want to minimize downtime.
> They allow compression. I often will have relatively little free disk
> space to work with.
> My concept is "customized dump", drop database, create new schema
> database, "customized restore".
>
> My upgrade requires many schema and data content changes. I've tried
> using standard SQL statements in perl scripts to do all of it, but even
> with no indexes during the inserts (creating them afterward for the
> lookup work) and every other optimization I know of, it takes several
> days to turn our old 100 GB database into a new one. I was hoping that
> I could modify the speedy pg_dump/pg_restore utilities to make these
> changes "on the fly". It gets tricky because I have to restore some of
> the data to different tables with varying schemas and also change the
> table linking. But this is all doable as long as I can "massage" the
> SQL statements and data both when they go into the dump file and when
> they are restored back out.
>
> Or am I trying to do the impossible?
> -Lynn
>
>
