Re: disk backups

From: Martijn van Oosterhout <kleptog(at)cupid(dot)suninternet(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: disk backups
Date: 2000-06-30 15:15:58
Message-ID: 395CB9AE.7250D597@cupid.suninternet.com
Lists: pgsql-general

Tom Lane wrote:
>
> Martijn van Oosterhout <kleptog(at)cupid(dot)suninternet(dot)com> writes:
> > Is there a better way? Here pg_dumping the DB takes over half an hour
> > (mainly because pg_dump chews all available memory).
>
> pg_dump shouldn't be a performance hog if you are using the default
> COPY-based style of data export. I'd only expect memory problems
> if you are using INSERT-based export (-d or -D switch to pg_dump).
> For now, the answer is "don't do that" ... at least not on big tables.

Aha! Thanks for that! Last time I asked here, nobody answered...
So it only happens with an INSERT-based export; I didn't know
that (though I can't see why there would be a difference...)
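
For anyone else following along, the two invocations (database
name made up) would be:

    pg_dump mydb    > dump.sql    # default: COPY-based data export
    pg_dump -D mydb > dump.sql    # INSERT-based export, with column names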

Yes, we are using -D, mainly because we've had "issues" with
the COPY-based export, i.e., it wouldn't read the resulting file
back in. Admittedly this was a while ago now and I haven't checked
since.

The data in question was a long text field containing (long)
snippets of HTML: quotes, backslashes, tabs, newlines, etc.,
the works. The INSERT-style dump never had a problem...
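
For what it's worth, as I understand the COPY text format, such a
row should come out with the specials backslash-escaped: quotes pass
through literally, while embedded tabs, newlines, and backslashes
become \t, \n, and \\. A made-up example line (the separator between
the id and the text field is a literal tab):

    42	<p class="big">part one\npart two \\ end</p>

So a correctly written dump file should survive the round trip;
maybe what bit us was an old bug in the escaping.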

> This could be fixed in either of two ways:
>
> 1. recode pg_dump to use DECLARE CURSOR and FETCH to grab table contents
> in reasonable-size chunks (instead of with an all-at-once SELECT);
>
> 2. add an API to libpq that allows a select result to be retrieved
> on-the-fly rather than accumulating it in libpq's memory.
>
> The second is more work but would be more widely useful.

I was thinking of writing my own version of pg_dump that would
do that, but would also allow specifying an ordering constraint,
i.e., clustering. Maybe it would be better to just switch to the
other output format...
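
Something like this is what I had in mind for the cursor approach.
A rough sketch only: the connection string, table, column names, and
chunk size are all placeholders, and real code would backslash-escape
the values the way COPY does:

    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    int main(void)
    {
        /* connection parameters are placeholders */
        PGconn   *conn = PQconnectdb("dbname=mydb");
        PGresult *res;

        if (PQstatus(conn) == CONNECTION_BAD)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            exit(1);
        }

        /* cursors only live inside a transaction block */
        PQclear(PQexec(conn, "BEGIN"));

        /* the ORDER BY gives the dump a clustered ordering;
         * table and column names are made up */
        PQclear(PQexec(conn,
            "DECLARE dump_cur CURSOR FOR "
            "SELECT id, body FROM pages ORDER BY id"));

        for (;;)
        {
            int i, n;

            /* grab a reasonable-size chunk instead of the whole table */
            res = PQexec(conn, "FETCH 1000 FROM dump_cur");
            n = PQntuples(res);
            if (n == 0)
            {
                PQclear(res);
                break;
            }
            for (i = 0; i < n; i++)
                /* real code would escape tabs/newlines in the values */
                printf("%s\t%s\n",
                       PQgetvalue(res, i, 0),
                       PQgetvalue(res, i, 1));
            PQclear(res);
        }

        PQclear(PQexec(conn, "CLOSE dump_cur"));
        PQclear(PQexec(conn, "END"));
        PQfinish(conn);
        return 0;
    }

Memory use is then bounded by the FETCH size rather than the table
size, and the ORDER BY gets me the clustering for free.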

>
> However, it's not been much of a priority, since insert-based data
> export is so slow to reload that no sensible person uses it for
> big tables anyway ;-)

Well, there's slow, and then there's something that works. But thanks
for clearing this up, because it's not mentioned anywhere...
--
Martijn van Oosterhout <kleptog(at)cupid(dot)suninternet(dot)com>
http://cupid.suninternet.com/~kleptog/
