Postgres WarmStandby using ZFS or Snapshot to create Web DB?

From: Jennifer Spencer <jennifer(at)sun(dot)Stanford(dot)EDU>
To: pgsql-general(at)postgresql(dot)org
Subject: Postgres WarmStandby using ZFS or Snapshot to create Web DB?
Date: 2007-11-29 21:08:21
Message-ID: 474F2A45.5060509@sun.stanford.edu
Lists: pgsql-general

I am looking for suggestions on setting up a large Postgres database scenario. We are running a
science project that expects a large volume of data from its instrument. If you have time to comment,
any advice is most welcome!

Here's the deal:
1. We expect to store ~1TB per year of data for 10 years. This is mostly a warehouse situation: not a
lot of updates or deletes, but a lot of inserts.
2. We need to replicate a subset of our postgres data to an international science community, and in
turn take in some data from them (we plan to do this via Slony-1 unless there is a compelling reason
not to).
3. We need to make a copy of our database available to the general internet community.
4. We need to have a warm standby available in case of disaster. We plan to use PITR with WAL files
for this (again, unless there is a compelling reason not to; see the sketch after this list).
5. We need to make regular full tape backups (~weekly) and occasionally scheduled maintenance (think
quarterly maintenance).
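For item 4, here is a minimal sketch of the WAL-shipping setup I have in mind. Paths and hostnames
are hypothetical, and it assumes a waiting restore script such as contrib/pg_standby (new in 8.3);
on 8.2 a home-grown wait-and-copy script would fill the same role:

    # postgresql.conf on the primary (A): ship each completed WAL
    # segment to the standby's archive directory
    archive_command = 'rsync -a %p standby-b:/pg_wal_archive/%f'

    # recovery.conf on the warm standby (B): wait for each segment
    # to arrive, keeping B in continuous recovery
    restore_command = 'pg_standby /pg_wal_archive %f %p'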

We do not have an endless budget, sadly, so I could use some help with how to go about this. Having
gone from a job where my database software actually had paid tech support to one that doesn't
(Postgres), I am pretty concerned about what could go wrong.

Assume our primary server (A) is good enough to serve our in-house users, and our warm standby (B) is
a physical duplicate of A. My plan is to ship WAL files to B and make a tape backup from B weekly,
either by keeping B out of recovery mode for ~6 hours or by snapshotting B's data files at a given
time and taping off the snapshot. This takes care of A and B and the backups, but what about the
other requirements?
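For the snapshot variant, something like this is what I picture on B, assuming the data files live
on a ZFS dataset (pool/dataset names and the tape device are hypothetical):

    # Pause WAL replay on B (how depends on the restore script),
    # then take an atomic snapshot of the data files:
    zfs snapshot tank/pgdata@weekly
    # Stream the snapshot off to tape:
    zfs send tank/pgdata@weekly > /dev/rmt/0
    # Resume recovery on B, and drop the snapshot once it is taped:
    zfs destroy tank/pgdata@weekly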

How do we get data to our web community without exposing the primary to hacking? And how do we do
that economically? One plan in place is to use a set of snapshot disks from A's data files to act as
the web database's files. Can we do that? Is that exceptionally stupid? Another plan involves using a
Solaris 10 ZFS solution to clone the warm standby B's files to act as the web database's files (see:
http://www.lethargy.org/~jesus/archives ... crack.html for more). I am not sure either of the above
solutions will work quickly. We'd like a turnaround time from A to B to the web of less than 30
minutes for newly created tables, or new data in existing tables.
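As I understand the ZFS route, the clone step would look roughly like this (dataset names and port
are hypothetical; a clone must live in the same pool as its snapshot):

    # On B: snapshot the standby's files and clone them writable:
    zfs snapshot tank/pgdata@web
    zfs clone tank/pgdata@web tank/pgdata-web
    # Let the clone finish recovery to a consistent state (adjust
    # recovery.conf / trigger file in the clone first), then start
    # a second postmaster on it under its own port:
    pg_ctl -D /tank/pgdata-web -o "-p 5433" start

Repeating the snapshot/clone/start cycle on a schedule is, I gather, how we would stay inside the
30-minute window.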

Lastly, we plan to pinhole our firewall for trusted Slony-1 science "customers": people we already
know, with specific IP addresses. We have yet to figure out the load Slony-1 will put on our primary
(A). Any experience with that out there?
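Beyond the firewall pinhole, I assume pg_hba.conf on whichever node Slony-1 connects to can pin each
trusted site to its known address (the database name, role, and addresses below are hypothetical):

    # pg_hba.conf: one line per trusted Slony-1 site
    # TYPE  DATABASE   USER   CIDR-ADDRESS     METHOD
    host    sciencedb  slony  192.0.2.10/32    md5
    host    sciencedb  slony  198.51.100.7/32  md5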

My prior work experience involves a 1TB Sybase database, its warm standby, and regular backups and
quarterly maintenance. I am new to Postgres, and the idea of no tech support phone calls when things
break is a scary one! I am trying to create a belt-and-suspenders redundant solution so that if
something breaks, I have time to figure out what went wrong and fix it before the users even know
there's a problem.
