Re: How to determine a database is intact?

From: Wes <wespvp(at)syntegra(dot)com>
To: Jan Wieck <JanWieck(at)yahoo(dot)com>
Cc: Richard Huxton <dev(at)archonet(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to determine a database is intact?
Date: 2004-09-04 20:51:24
Message-ID: BD5F94FC.91CE%wespvp@syntegra.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 9/4/04 2:42 PM, "Jan Wieck" <JanWieck(at)yahoo(dot)com> wrote:

> Why isn't doing a restore of those reasonable?

Because of the size and time required. Right now, it takes at least 24
hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
etc). That is going to do nothing but increase. Extrapolating linearly the
*current* load, it will take at least 4 days to load when the database size
peaks. But, based on past experience, the load will increase significantly
before then (probably by a factor of 2-4 to be conservative). When the
database gets that large, I have to consider that the reload time may not
increase linearly with the size. If we do a pg_dumpall once a week, it will
take longer to do a reload than the period between dumps. Just the pg_dump
alone could easily take close to a day. It also requires we have duplicate
fully configured hardware for each copy of the database we run just to
verify a pg_dumpall - if it takes that long to load, I can't share hardware.
Add to that the people time to monitor the systems and the process of 2x the
hardware... In short, I need a solution that scales to huge databases
(hundreds of gigabytes to over a terabyte), not one that works just for
small to medium databases.

pg_dumpall is hopefully reliable will presumably give me a snapshot that I
know I can restore from if the database becomes hopelessly corrupt. But I
can't individually test each one. I was hoping for a utility that would go
through and verify all indexes and data are consistent, and if not, attempt
to correct them.

As for your earlier question of cascading errors, consider a file system - a
type of database. If you get a file system error and correct it quickly,
you usually will lose nothing. If, however, you ignore that error, it is
likely to get worse over days or weeks. Other errors will crop up as a
result of the bad information in the first one. At some point, the file
system corruption may become so bad that it can't be recovered. Format and
reload. I have seen this on NTFS, UFS, HFS/HFS+, and even ReiserFS.
Journaling greatly reduces, but doesn't eliminate, this problem. There are
tools that will scan your file system and guarantee it's integrity, or fix
the errors (or attempt to fix them) if it finds any. I was looking for
something similar for a Postgres database.

Wes

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Google Mike 2004-09-04 20:53:48 HOWTO: Get a table or database definition
Previous Message Russ Brown 2004-09-04 20:02:54 Re: psql leaking?