
Re: 9.0.4 Data corruption issue

From: Ken Caruso <ken(at)ipl31(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: 9.0.4 Data corruption issue
Date: 2011-07-17 07:31:41
Message-ID: CAMg8r_r8dqsDtfkqRLu1r_XTxDqmouqnFaArcWhi51SQX_33Tw@mail.gmail.com
Lists: pgsql-admin
On Sat, Jul 16, 2011 at 2:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Ken Caruso <ken(at)ipl31(dot)net> writes:
> > Sorry, the actual error reported by CLUSTER is:
>
> > gpup=> cluster verbose tablename;
> > INFO:  clustering "dbname.tablename"
> > WARNING:  could not write block 12125253 of base/2651908/652397108
> > DETAIL:  Multiple failures --- write error might be permanent.
> > ERROR:  could not open file "base/2651908/652397108.1" (target block
> > 12125253): No such file or directory
> > CONTEXT:  writing block 12125253 of relation base/2651908/652397108
>
> Hmm ... it looks like you've got a dirty buffer in shared memory that
> corresponds to a block that no longer exists on disk; in fact, the whole
> table segment it belonged to is gone.  Or maybe the block or file number
> in the shared buffer header is corrupted somehow.
>
> I imagine you're seeing errors like this during each checkpoint attempt?
>

Hi Tom,

Thanks for the reply.

Yes. I tried pg_start_backup() to force a checkpoint, and it failed with a
similar error.


>
> I can't think of any very good way to clean that up.  What I'd try here
> is a forced database shutdown (immediate-mode stop) and see if it starts
> up cleanly.  It might be that whatever caused this has also corrupted
> the back WAL and so WAL replay will result in the same or similar error.
> In that case you'll be forced to do a pg_resetxlog to get the DB to come
> up again.  If so, a dump and reload and some manual consistency checking
> would be indicated :-(
>

Before I saw this message, I had already restarted Postgres. It came up in a
consistent state, and I then re-ran CLUSTER on the database without error;
everything appears to be fine. Any idea what caused this? Could it have been
related to the VACUUM FULL?

Thanks

-Ken


>
>                        regards, tom lane
>
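
For readers trying to relate the block number in the quoted error to the file name it mentions, here is a minimal sketch of the arithmetic. It assumes a stock build with 8 kB blocks and 1 GB segment files (131072 blocks per segment); the helper names locate_block and segment_file_name are made up for illustration and are not part of PostgreSQL.

```python
# Assumed defaults for a stock PostgreSQL build; custom builds may differ.
BLCKSZ = 8192          # bytes per block
RELSEG_SIZE = 131072   # blocks per segment file (1 GB / 8 kB)


def locate_block(block_number, blcksz=BLCKSZ, relseg_size=RELSEG_SIZE):
    """Return (segment_number, block_in_segment, byte_offset_in_segment)."""
    segment, block_in_segment = divmod(block_number, relseg_size)
    return segment, block_in_segment, block_in_segment * blcksz


def segment_file_name(relfilenode_path, segment):
    """Segment 0 has no suffix; later segments are named <path>.<N>."""
    if segment == 0:
        return relfilenode_path
    return f"{relfilenode_path}.{segment}"


# Values taken from the error message quoted in this thread.
segment, block_in_segment, offset = locate_block(12125253)
print(segment, block_in_segment, offset)
print(segment_file_name("base/2651908/652397108", segment))
```

Under those assumptions, block 12125253 would sit in segment 92, roughly 92 GB into the relation, while the error names the ".1" file. One possible reading is that the server gave up at the first segment it could not open on the way to the target block; either way, the numbers fit both of the explanations offered above (missing segment files or a corrupted block number in the buffer header).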
