Re: How to increace nightly backup speed

From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: How to increace nightly backup speed
Date: 2006-11-29 21:22:44
Message-ID: 60mz6a7xsr.fsf@dba2.int.libertyrms.com
Lists: pgsql-general

vivek(at)khera(dot)org (Vivek Khera) writes:
> On Nov 28, 2006, at 11:11 AM, Andrus wrote:
>
>> 1. My database size seems to be approx 1 GB and download speed is
>> approx 600 kb/s. Your solution requires 4.5 hours download time
>> since 1 GB of data must be downloaded.
>
> If you're running pg_dump on a remote host, you're transferring the
> data over the pipe and compressing locally, since the pg wire
> protocol is not compressed. The compression time is probably not
> causing any slowness unless your local CPU is incredibly slow and
> can't keep up with the data streaming in at that low speed.
>
> I don't see how you can improve your download speed without doing
> compression at the other end to reduce the number of bits you have to
> push through your network.

... And if the network is pretty fast, the CPU that compression eats
is likely to slow the transfer down rather than speed it up.

> SSH seems to be a reasonable solution to this (run dump + compress on
> remote host, then copy data over), but if you rule out anything that
> doesn't go over port 5432 then I think you're out of luck...
>
> Well, one thing... is there another host on the remote LAN to which
> you can ssh? If so, then use SSH port-forwarding and enable
> compression on the ssh connection to that host, then connect to
> postgres via the forwarded port to do your dump locally. The data
> will be compressed on the wire.
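
The port-forwarding approach Vivek describes can be sketched roughly as
follows; the host names, user, port, and database name here are
placeholders for illustration, not anything from the original thread:

```shell
# On the local machine: open a compressed SSH tunnel via a host on the
# remote LAN ("gateway.example.com" is a placeholder).
#   -C  enables compression on the SSH connection
#   -N  no remote command, just forwarding
#   -L  forwards local port 5433 to the database host's port 5432
ssh -C -N -L 5433:dbhost.internal:5432 user@gateway.example.com &

# Then dump through the tunnel; the data crosses the slow link
# compressed, even though the PG wire protocol itself is not.
pg_dump -h localhost -p 5433 -U dbuser mydb > mydb.sql
```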

We found that some of our bigger backups were taking ~4h due to the
cost of bzip2. Dumping to a plain file first turned this into "dump
for 1h, compress for 3", which cut down the length of the transaction.

We genuinely needed bzip2, as a later step transfers the data across a
much slower network connection.
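That split might look like this in a backup script; the database and
file names are made up for the example:

```shell
# Dump uncompressed first: the dump transaction lasts only ~1h.
pg_dump mydb > /backups/mydb.sql

# Compress afterwards, outside any database transaction (~3h with
# bzip2); leaves /backups/mydb.sql.bz2.
bzip2 /backups/mydb.sql

# Compare with the one-step pipeline, which holds the dump transaction
# open for the full dump-plus-compress duration:
#   pg_dump mydb | bzip2 > /backups/mydb.sql.bz2
```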

Some testing with various bzip2 and gzip options showed that
compression was almost certain to be quite expensive if used in the
initial "processing pipeline."
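A quick way to reproduce that kind of comparison on sample data (the
file names are made up for the example):

```shell
# Generate a few MB of compressible sample data.
seq 1 500000 > sample.txt

# Fast-but-light vs. slow-but-tight: compare wall time and output size.
time gzip -1 -c sample.txt > sample.gz
time bzip2 -9 -c sample.txt > sample.bz2

ls -l sample.txt sample.gz sample.bz2
```

On typical data, gzip -1 finishes far sooner, while bzip2 -9 produces
the smaller file; which trade-off wins depends on whether the
bottleneck is CPU or the network.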

There are other options out there that could conceivably change the
price of compression, such as:

http://www.lzop.org/
http://www.quicklz.com/

Of course, those compression systems are not as well known, and so not
as well trusted. Maybe worth looking into, though.
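For instance, lzop trades compression ratio for speed; a hypothetical
substitution into the pipeline above (assuming lzop is installed, and
with "mydb.sql" standing in for a dump file from an earlier pg_dump
step) would be:

```shell
# lzop -1 favors speed over ratio; output lands in mydb.sql.lzo.
lzop -1 mydb.sql
```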
--
"cbbrowne","@","linuxdatabases.info"
http://cbbrowne.com/info/rdbms.html Rules of the Evil Overlord #135.
"My doomsday machine will have the advanced technological device
called a capacitor just in case someone inconveniently pulls the plug
at the last moment. (If I have access to REALLY advanced technology, I
will include the even better back-up device known as the "battery.")"
<http://www.eviloverlord.com/>
