Re: Database corruption?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)atentus(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, Vadim Mikheev <vmikheev(at)sectorbase(dot)com>
Subject: Re: Database corruption?
Date: 2001-10-23 00:14:49
Message-ID: 28353.1003796089@sss.pgh.pa.us
Lists: pgsql-general

Alvaro Herrera <alvherre(at)atentus(dot)com> writes:
> FATAL 2: XLogFlush: request is not satisfied

We had a previous report of this same failure message --- see
the thread starting at
http://fts.postgresql.org/db/mw/msg.html?mid=1033586

> And here is a backtrace taken from a core file I found lying around,
> which has a timestamp that makes me think it has something to say:

> (gdb) bt
> #0 0x4018cbf4 in memmove () from /lib/libc.so.6
> #1 0x08100f85 in PageRepairFragmentation ()
> #2 0x080ae9a7 in scan_heap ()
> #3 0x080adfb4 in vacuum_rel ()
> #4 0x080adbee in vac_vacuum ()
> #5 0x080adb68 in vacuum ()

It would be useful to look into that too, for sure, but I think it is
probably not related to your XLog problem.

> The database has been running for months without trouble. I'm now trying
> desperate measures, but I fear I will have to restore from backup (a week
> old). I have taken a tarball of the complete location (pg_xlog included and
> all that stuff) if anyone wants to see it (but it's 2 GB).

As I said to Denis in the earlier thread, it would be good to try to
track down which page is corrupted and maybe then we'd understand how
it got that way. Since you have the database tarball, you have the
raw material to look into it --- you'd need to rebuild Postgres with
debug symbols enabled and trace back from the failure points to learn
more. Are you up to that, or could you grant access to your machine to
someone who is?

As for your immediate problem, I'd counsel reducing the elog(STOP) that
emits this message in XLogFlush to elog(DEBUG) so that you can bring the
database up, and then you can try to pg_dump your current data. You'll
probably still want to re-initdb and restore once you get a consistent dump.
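
To make the effect of that severity change concrete, here is a toy model
in C. It is only a sketch and not the actual xlog.c code: the Severity
enum, report(), and flush_check() are made-up names for illustration, and
the two levels merely stand in for the STOP and DEBUG levels mentioned
above. The point it shows is that the same failed check either ends the
process or just logs and lets the caller carry on.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical severity levels; stand-ins for the elog() levels named
 * in the message, NOT the PostgreSQL definitions. */
typedef enum { LVL_DEBUG, LVL_STOP } Severity;

/* Toy reporter: LVL_STOP terminates the process, LVL_DEBUG only logs. */
void report(Severity level, const char *msg)
{
    assert(level == LVL_DEBUG || level == LVL_STOP);
    fprintf(stderr, "%s: %s\n", level == LVL_STOP ? "STOP" : "DEBUG", msg);
    if (level == LVL_STOP)
        exit(2);
}

/* Toy flush check (assumed shape, for illustration only).
 * Returns 1 if the log is flushed at least up to `requested`.
 * Otherwise reports at `level` and, if still running, returns 0. */
int flush_check(Severity level, unsigned long flushed_upto,
                unsigned long requested)
{
    if (flushed_upto < requested) {
        report(level, "XLogFlush: request is not satisfied");
        return 0;
    }
    return 1;
}
```

With LVL_STOP, a call such as flush_check(LVL_STOP, 10, 20) never returns,
which is the "cannot bring the database up" situation. With LVL_DEBUG the
same call logs the complaint and returns 0, so startup can proceed far
enough to attempt a dump.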

Um, Vadim? Still of the opinion that elog(STOP) is a good idea here?
That's two people now for whom that decision has turned localized
corruption into complete database failure. I don't think it's a good
tradeoff.

regards, tom lane
