Re: Backup strategies with significant bytea data

From: "Leigh Dyer" <Leigh(dot)Dyer(at)healthscope(dot)com(dot)au>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Backup strategies with significant bytea data
Date: 2010-01-14 04:35:07
Message-ID: 4B4F39A8.20AD.00BA.0@healthscope.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ivan Voras wrote:
> Leigh Dyer wrote:
>> Hi,
>>
>>For years now I've simply backed up my databases by doing a nightly
>>pg_dump, but since we added the ability for users to import binary
files
>>in to our application, which are stored in a bytea fields, the dump
>>sizes have gone through the roof — even with gzip compression,
they're
>>significantly larger than the on-disk size of the database. My guess
is
>>that this due to the way that the binary data from the bytea fields
is
>>encoded in the dump file when it's produced.
>
>Have you tried another dump format? E.g. "-F c" argument to pg_dump?

Hi Ivan,

I have tried the custom dump format, but it made no difference to the
backup sizes. Here are some numbers:

Database size on disk (according to pg_database_size): 1017MB
pg_dump size (standard format): 3475MB
pg_dump size (standard format, gzip-compressed): 1132MB
pg_dump size (custom format, ie: "-F c" option): 1134MB

This testing was done on my development server, with the database
restored a backup from the production server, so the database size on
disk might be a bit smaller than it would be if this was a running
database with some dead tuples in there. Still, there's a very big
difference between the raw data size and the dump sizes.

Thanks
Leigh

Please consider the environment before printing this message

Browse pgsql-general by date

  From Date Subject
Next Message Yan Cheng Cheok 2010-01-14 05:26:30 Re: Extremely Slow Cascade Delete Operation
Previous Message Craig Ringer 2010-01-14 02:41:35 Re: Collate order on Mac OS X, text with diacritics in UTF-8