Quick Links

Re: Backup strategies with significant bytea data

From:	"Leigh Dyer" <Leigh(dot)Dyer(at)healthscope(dot)com(dot)au>
To:	<pgsql-general(at)postgresql(dot)org>
Subject:	Re: Backup strategies with significant bytea data
Date:	2010-01-14 04:35:07
Message-ID:	4B4F39A8.20AD.00BA.0@healthscope.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Ivan Voras wrote:
> Leigh Dyer wrote:
>> Hi,
>>
>>For years now I've simply backed up my databases by doing a nightly
>>pg_dump, but since we added the ability for users to import binary
files
>>in to our application, which are stored in a bytea fields, the dump
>>sizes have gone through the roof — even with gzip compression,
they're
>>significantly larger than the on-disk size of the database. My guess
is
>>that this due to the way that the binary data from the bytea fields
is
>>encoded in the dump file when it's produced.
>
>Have you tried another dump format? E.g. "-F c" argument to pg_dump?

Hi Ivan,

I have tried the custom dump format, but it made no difference to the
backup sizes. Here are some numbers:

Database size on disk (according to pg_database_size): 1017MB
pg_dump size (standard format): 3475MB
pg_dump size (standard format, gzip-compressed): 1132MB
pg_dump size (custom format, ie: "-F c" option): 1134MB

This testing was done on my development server, with the database
restored a backup from the production server, so the database size on
disk might be a bit smaller than it would be if this was a running
database with some dead tuples in there. Still, there's a very big
difference between the raw data size and the dump sizes.

Thanks
Leigh

Please consider the environment before printing this message

Browse pgsql-general by date

	From	Date	Subject
Next Message	Yan Cheng Cheok	2010-01-14 05:26:30	Re: Extremely Slow Cascade Delete Operation
Previous Message	Craig Ringer	2010-01-14 02:41:35	Re: Collate order on Mac OS X, text with diacritics in UTF-8