Re: corruption issue after server crash - ERROR: unexpected chunk number 0

From: Mike Broers <mbroers(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: corruption issue after server crash - ERROR: unexpected chunk number 0
Date: 2013-11-21 23:01:09
Message-ID: CAB9893hjbDNCgkC-BPBns=jHyN1O3uYJyiS9TN8A7DX=Vvq3Ew@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks, after this pg_dumpall I am going to see what kind of impact I can
expect from running VACUUM FREEZE ANALYZE (normally I just run vacuumdb
-avz nightly via a cron job) and schedule time to run this in production
against all the tables in the database. Is there anything I should look
out for with vacuum freeze?

Much appreciated,
Mike

On Thu, Nov 21, 2013 at 4:51 PM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

> Mike Broers <mbroers(at)gmail(dot)com> wrote:
>
> > Thanks for the response. fsync and full_page_writes are both on.
>
> > [ corruption appeared following power loss on the machine hosing
> > the VM running PostgreSQL ]
>
> That leaves three possibilities:
> (1) fsync doesn't actually guarantee persistence in your stack.
> (2) There is a hardware problem which has not been recognized.
> (3) There is a so-far unrecognized bug in PostgreSQL.
>
> Based on my personal experience, those are listed in descending
> order of probability. I seem to recall reports of some VM for
> which an fsync did not force data all the way to persistent
> storage, but I don't recall which one. You might want to talk to
> your service provider about what guarantees they make in this
> regard.
>
> > Is there something else I can run to confirm we are more or less
> > ok at the database level after the pg_dumpall or is there no way
> > to be sure and a fresh initdb is required.
>
> Given that you had persistence options in their default state of
> "on", and the corruption appeared after a power failure in a VM
> environment, I would guess that the damage is probably limited.
> That said, damage from this sort of event can remain hidden and
> cause data loss later. Unfortunately we do not yet have a
> consistency checker that can root out such problems. If you can
> arrange a maintenance window to dump and load to a fresh initdb,
> that would eliminate the possibility that some hidden corruption is
> lurking. If that is not possible, running VACUUM FREEZE ANALYZE
> will reduce the number of things that can go wrong, without
> requiring down time.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Grittner 2013-11-21 23:09:55 Re: corruption issue after server crash - ERROR: unexpected chunk number 0
Previous Message Kevin Grittner 2013-11-21 22:51:53 Re: corruption issue after server crash - ERROR: unexpected chunk number 0