Re: Backup Policy & Disk Space Issues

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Volkan YAZICI <yazicivo(at)ttmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Backup Policy & Disk Space Issues
Date: 2008-12-23 02:27:57
Message-ID: 49504CAD.5020805@postnewspapers.com.au
Lists: pgsql-general

Volkan YAZICI wrote:
> On Mon, 22 Dec 2008, David Fetter <david(at)fetter(dot)org> writes:
>> On Mon, Dec 22, 2008 at 10:07:21AM +0200, Volkan YAZICI wrote:
>>> 15x4250 = 63750 = 62.25TB
>> SATA disk space is quite cheap these days, so unless something is very
>> badly wrong with your funding model, this is not really a problem.
>
> Umm... A minority of the servers have SATA interface. (Most of 'em use
> SAS drives and SAN systems.)

Yes... but you can buy new SATA storage enclosures or storage servers.
SATA storage enclosures with SAS and/or Fibre Channel interfaces to the
host exist, and are suitable for exactly this sort of bulk
low-performance archival storage role. You have enough data that you
should expect to spend a bit on backups, I'm afraid.

SATA storage arrays might not be ideal for long-term storage of your full
backups, but they're perfect for the storage of WAL archives &
snapshots, and for the shorter-lived backups that you'll periodically
rotate out.
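
As a rough sketch of what "rotating out" the shorter-lived backups could
look like: the function below keeps the newest few backups and reports the
rest as candidates for deletion. The (path, completion time) tuple layout,
the function name and the default keep count are assumptions made for
illustration, not something from this thread.

```python
from datetime import datetime
from typing import List, Tuple

# Hypothetical record layout: (path to backup, time the backup completed).
Backup = Tuple[str, datetime]


def split_for_rotation(
    backups: List[Backup], keep: int = 2
) -> Tuple[List[Backup], List[Backup]]:
    """Return (kept, expired): the newest `keep` backups and all older ones."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Newest first, so the head of the list is what we retain.
    ordered = sorted(backups, key=lambda b: b[1], reverse=True)
    return ordered[:keep], ordered[keep:]
```

Nothing is deleted here; the caller decides what to do with the expired
list, which leaves room to check that the kept backups are actually
restorable first.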

I built an 8TB storage server for the (small) company I work for at a
pitiful cost to ensure that we always had at least two versions of all
backups in storage that was reliable*, immediately accessible, and
encrypted in case of theft. It's not the only backup mechanism, but it's
the main one and by itself is adequate for all but the most critical
data. It's hard to overemphasise the benefits it's had in terms of
improved backup reliability and quick access to backups.

About 100TB, which is about what I'd plan for in your case, is ... more
expensive. That said, with redundancy within each enclosure and between
them it'd be a pretty solid way to store your backups. It helps that you
may not want to store your long-term archival backups on SATA arrays,
and it's also not clear to what extent you've investigated options for
reducing your backup sizes in the first place. 40-50 TB is not an
unreasonable amount of storage to pick up in the form of arrays of large
external SATA enclosures.
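
For what it's worth, the sizing arithmetic can be sketched as below. The
15 x 4250 GB input is the figure quoted earlier in the thread; the 1.6
headroom multiplier is purely an assumption picked to show how one might
get from the raw total to a planning figure near 100TB, and the function
names are made up for the example.

```python
def gb_to_tb(gigabytes: float) -> float:
    """Convert GB to TB using binary units (1 TB = 1024 GB)."""
    return gigabytes / 1024


def planned_capacity_tb(raw_gb: float, headroom: float = 1.6) -> float:
    """Raw backup size scaled by a headroom factor (assumed value, not
    from the thread) to allow for growth and overlapping backup sets."""
    if headroom < 1:
        raise ValueError("headroom should not shrink the estimate")
    return gb_to_tb(raw_gb) * headroom


raw_gb = 15 * 4250  # figure quoted upthread
print(f"raw: {gb_to_tb(raw_gb):.2f} TB")
print(f"planned: {planned_capacity_tb(raw_gb):.2f} TB")
```

The right multiplier depends on retention policy and growth rate, so treat
the default as a placeholder.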

In particular, if you're backing up the database cluster at the file
system level, you might want to look into using dumps for your
longer-lived backups instead. For one thing, a compressed dump tends to
be a LOT smaller than a filesystem-level backup of a Pg cluster,
and for another you protect yourself against most forms of undiscovered
corruption in the cluster.
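
A minimal sketch of taking such a compressed dump, assuming pg_dump is on
the PATH and connection settings come from the environment. It uses
pg_dump's custom format with its built-in compression; the helper names,
database name and output path are placeholders for illustration.

```python
import subprocess
from typing import List


def build_dump_command(
    dbname: str, out_path: str, compress_level: int = 9
) -> List[str]:
    """Build a pg_dump invocation for a compressed custom-format dump."""
    if not 0 <= compress_level <= 9:
        raise ValueError("compress_level must be between 0 and 9")
    return [
        "pg_dump",
        "--format=custom",               # archive format read by pg_restore
        f"--compress={compress_level}",  # 0 = none, 9 = smallest output
        f"--file={out_path}",
        dbname,
    ]


def run_dump(dbname: str, out_path: str) -> None:
    """Run pg_dump and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_dump_command(dbname, out_path), check=True)
```

Something like run_dump("mydb", "/backups/mydb.dump") would then produce a
file that pg_restore can read. Note that this dumps one database at a
time; roles and other cluster-wide objects need pg_dumpall.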

If you do go for SATA storage, avoid systems that rely on SATA
multiplexers if possible. They're REALLY slow, and are particularly
awful in RAID environments. Given that alternatives that have many SATA
interfaces and a single SAS port for the host interface exist, as do
internally RAID-ed Fibre Channel options, multiplexer-based systems
don't seem worth it.

* with RAID and proper array scrubbing on a server attached to a UPS
it's WAY more reliable than the previous DDS-4 DAT backups. It also has
the advantage of not needing five or six tapes per day and operating
completely unattended, so risk of human error is drastically reduced.

--
Craig Ringer
