Re: incremental backup of postgres database?

From: Palle Girgensohn <girgen(at)pingpong(dot)net>
To: Gary Stainburn <gary(dot)stainburn(at)ringways(dot)co(dot)uk>, pgsql-admin(at)postgresql(dot)org
Subject: Re: incremental backup of postgres database?
Date: 2003-02-07 21:00:31
Message-ID: 15640000.1044651631@palle.girgensohn.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

--On fredag, februari 07, 2003 16.49.38 +0000 Gary Stainburn
<gary(dot)stainburn(at)ringways(dot)co(dot)uk> wrote:

> On Friday 07 Feb 2003 4:05 pm, Palle Girgensohn wrote:
>> Hi!
>>
>> What would be the best suggestion for incremental backup of a rather
>> large database, where the bulk data volume consists of large objects.
>> Since backup will be transmitted over a 2 Mbit/s internet line, we need
>> to minimize the data flow for each nightly backup. The compressed
>> database dump file, when dumped with pg_dump -F c -b, is roughly 2.5 GB,
>> whereas a dump without large objects is roughly is only 2% that size. I
>> can live with having to transfer the BLOB-less dump every night, but not
>> several gigabytes of data...
>>
>> So, I will either need to find a way to get the latest data (I have
>> timestamps for all LOBS) and somehow get it to a file in a restorable
>> format... One simple way would be to select all new blobs into a temp
>> file and copy that table to a backup file
>>
>> or
>>
>> replicate the database in real time to the backup site, using one of the
>> replication projects? How robust are the replication systems today? What
>> will happen if the 2Mb/s line fails temporarily?
>>
>> Perhaps there are other ideas for incremental backup of postgres
>> databases? Your input would be much appreciated.
>>
>> Thanks
>> Palle
>
> Just of the top of my head, have you looked at 'diff'ing the archive, e.g.
>
> diff yesterday.sql today.sql |gzip -c >diff.sql.gz
>
> to see what sort of file you'll be looking at. I'm looking at using a
> similar system here, although I don't have LOBS.

The though has accourred to me, but the files are in binary format, so some
sort of binary diff program, like xdiff, would be needed. Also, I think, in
the dumps, the tuples are in an nondeterministic order, and this order
might differ after each vaccum. This will make diffs rather large as well.

But hey, I'll check it, it might be worthwile.

/Palle

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Charles Hornberger 2003-02-08 01:12:07 PostgreSQL 7.2 + PAM = authentication failure?
Previous Message Lamar Owen 2003-02-07 19:26:53 Re: Problem installing Postgresql 7.3.1 on RedHat 8.0