Re: database corruption questions

From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Daniel Serodio <dserodio(at)mandic(dot)com(dot)br>
Cc: Heine Ferreira <heine(dot)ferreira(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: database corruption questions
Date: 2012-10-18 00:32:31
Message-ID: 507F4E1F.5000105@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 10/18/2012 01:06 AM, Daniel Serodio wrote:
> Craig Ringer wrote:
>> On 10/14/2012 05:53 AM, Heine Ferreira wrote:
>>> Hi
>>>
>>> Are there any best practices for avoiding database
>>> corruption?
>>
>> * Maintain rolling backups with proper ageing. For example, keep one a
>> day for the last 7 days, then one a week for the last 4 weeks, then
>> one a month for the rest of the year, then one a year.
> What kind of "rolling backups"? From pg_basebackup?

I'd recommend good old `pg_dump`, that way you're not assuming that your
cluster's on-disk format is intact and happy. Regular dumps will also
help detect any damage that might've crept in from file system
corruption, HDD/RAID faults, etc. Not that that should happen, but we're
talking preventative action here.

I elaborated somewhat here:


http://blog.ringerc.id.au/2012/10/avoiding-postgresql-database-corruption.html

Alternately, check out barman: http://www.pgbarman.org/ . I haven't
tried it yet, but it looks very promising. I'd still want to take
periodic dumps, as I'm reluctant to rely on `pg_basebackup` style
cluster copies alone.

>> * Keep up to date with the latest PostgreSQL patch releases. Don't be one of those people still running 9.0.0 when 9.0.10 is out.
> The problem is that updating the database usually results in downtime. Or can the downtime be avoided in a replication scenario?

Patches within the same minor release require extremely minimal downtime.

- Stop server
- Install new binaries
- Start server

How much downtime will a data corruption bug (yes, they've happened)
that's fixed in a new version cost you if you don't patch and it bites
you? Or a bug that causes a server crash and restart? Plan downtime, so
you don't have potentially much longer unplanned downtime at the worst
possible time.

You can do zero-downtime minor updates using hot standby and standby
promotion; see http://www.repmgr.org/ .

Updating to a new major release is a bigger job, but that's not what I'm
talking about.

BTW, please trim your replies to quote just the relevant context.

--
Craig Ringer

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Craig Ringer 2012-10-18 04:08:36 Re: 9.1/9.2 SERIALIZABLE: expected serialization failure between INSERT and SELECT not encountered
Previous Message Craig Ringer 2012-10-17 22:50:34 Re: allow servers to access to the same data