How to design for cheap reliability with PostgreSQL?

From: "Michael Meier" <mikem934(at)googlemail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: How to design for cheap reliability with PostgreSQL?
Date: 2008-11-28 11:03:08
Message-ID: 317da2200811280303xeaf9443ya449c8ededad13e2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I am in the process of designing a distributed application (Java EE 5)
which has demand for database functionality at several points. Some of
these databases need to provide really mission critical reliability,
i.e. it would be the end of our world, if we would suffer from data
loss or silent data corruption. Availabilty or servicability are nice
but not that important. This means that it isn't that important that a
transaction is committed, but if the transaction is committed, the
data should be 100% safe - without any chance of data loss or
corruption.

On the other hand, real proper reliability is expensive and we do not
have the money for serious hardware or "unbreakable" database
licences. I assume a DB size of < 50GB with moderate write I/O and
less moderate read I/O, so I would like to get a machine with 64 GB
Ram for in-memory caching.

I am thinking of PostgreSQL 8.3.n with n>=5 on top of Solaris 10 with ZFS.

So an extremely cheap low end hardware platform for the problem might
look like this:
Dual Quad Core Xeon Machine
64 GB FB-DIMM ECC-Ram
Two 8-Port SAS Controller
Internal 16 x 73 GB 15k hard Drives, partitioned as follows:
- 4 drives: 3-way raid-1 zfs mirror with one additional hot spare for Solaris
- 6 drives: 4-way raid-1 zfs mirror with two additional hot spares for
Postgresql Tables
- 6 drives: 4-way raid-1 zfs mirror with two additional hot spares for
Postgresql Logging
ZFS-Scrubbing every night.
approx. 24.000$ list price from Sun (x4250) with Sun Platinum support
for the hardware and Solaris,
or approx. 12.000$ self-assembled from newegg.

And an extremely expensive high-end solution for the problem might
look like this:
Sun SPARC Enterprise M4000 Server with two SPARC64 VII CPUs
64GB registered ECC-Ram
Sun M-Series RAS-Features (Checksumming of CPU-Registers, etc.)
An external Drive Array (e.g. J4400) with 16 drives. Partioned like
the internal drives above.
ZFS-Scrubbing every night.
approx. 160.000$

This insane price difference would get us data integrity beyond the
ZFS checksumming feature: It would protect the data even in the CPU
registers.

So the questions are:

1) Is this necessary? Do bit errors happen with configurations like
the cheap xeon one above?
2) If this is necessary, is there any (software) way around it? Is
there a PostgreSQL clustering solution available, where the cluster
nodes check each others data integrity?

Thanks a lot for any hints!

Michael

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Willy-Bas Loos 2008-11-28 11:04:37 Re: Separate Sessions?? (View data <-> Query tool)
Previous Message elekis 2008-11-28 10:56:29 Re: [NOOB] try tu put a number with pqexecprepare