From: | "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Alvaro Herrera <alvherre(at)atentus(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Database corruption? |
Date: | 2001-10-23 22:52:30 |
Message-ID: | 3705826352029646A3E91C53F7189E325183E5@sectorbase2.sectorbase.com |
Lists: | pgsql-general pgsql-hackers |
> >> Um, Vadim? Still of the opinion that elog(STOP) is a good
> >> idea here? That's two people now for whom that decision has
> >> turned localized corruption into complete database failure.
> >> I don't think it's a good tradeoff.
>
> > One is able to use pg_resetxlog so I don't see point in
> > removing elog(STOP) there. What do you think?
>
> Well, pg_resetxlog would get around the symptom, but at the cost of
> possibly losing updates that are further along in the xlog than the
> update for the corrupted page. (I'm assuming that the problem here
> is a page with a corrupt LSN.) I think it's better to treat flush
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
On restart, the entire content of every page modified after the last
checkpoint should be restored from WAL. In Denis' case it looks like the
page newly allocated for the update was somehow corrupted before
heapam.c:2235 (7.1.2 src), so there was no XLOG_HEAP_INIT_PAGE flag in
the WAL record => the page content was not initialized on restart. Denis
reported a system crash - very likely due to a memory problem.
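To make the point about the missing init flag concrete, here is a toy model of redo, not the actual heapam.c code. Everything in it (ToyPage, ToyWalRecord, toy_redo, REC_INIT_PAGE standing in for XLOG_HEAP_INIT_PAGE) is a hypothetical name invented for illustration. It only shows why a record without the init flag cannot repair a page that was already garbage.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64      /* toy size, not PostgreSQL's BLCKSZ */
#define REC_INIT_PAGE 0x01    /* hypothetical stand-in for XLOG_HEAP_INIT_PAGE */

typedef struct {
    unsigned char flags;      /* REC_INIT_PAGE or 0 */
    const char   *tuple;      /* data the record puts on the page */
} ToyWalRecord;

typedef struct {
    bool valid_header;        /* false models a corrupted page */
    char data[TOY_PAGE_SIZE];
} ToyPage;

/* Replay one record; returns true if the page is usable afterwards. */
static bool toy_redo(ToyPage *page, const ToyWalRecord *rec)
{
    assert(page != NULL && rec != NULL);

    if (rec->flags & REC_INIT_PAGE) {
        /* Flag present: rebuild the page from scratch, whatever it held. */
        memset(page, 0, sizeof(*page));
        page->valid_header = true;
    }
    if (!page->valid_header)
        return false;         /* no init flag: the corruption is left as is */

    strncpy(page->data, rec->tuple, TOY_PAGE_SIZE - 1);
    page->data[TOY_PAGE_SIZE - 1] = '\0';
    return true;
}
```

With a corrupted page, toy_redo() fails for a record whose flags are 0 and succeeds for the same record with REC_INIT_PAGE set, which is the situation described above for the page in Denis' report.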
> request past end of log as a DEBUG or NOTICE condition and keep going.
> Sure, it indicates badness somewhere, but we should try to have some
> robustness in the face of that badness. I do not see any reason why
> XLOG has to declare defeat and go home because of this condition.
Ok - what about setting a flag there during restart and aborting the
restart after all records from WAL have been applied? Then the DBA has a
choice: either run pg_resetxlog after that and try to dump the data, or
restore from an old backup. I still object to just a NOTICE there - it is
too easy to miss. And in normal processing mode I'd leave elog(STOP) there.
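A rough sketch of that proposal, only to show the control flow and not a patch against xlog.c. All names here (ToyXlogState, toy_flush, toy_recover, the FLUSH_* results) are hypothetical. A flush request past end of log during restart just sets a flag, the restart is refused once every record has been applied, and in normal processing the same condition still stops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long ToyLsn;            /* toy log position */

typedef struct {
    ToyLsn end_of_log;                   /* last position actually in the log */
    bool   in_recovery;                  /* true while replaying WAL on restart */
    bool   bad_lsn_seen;                 /* the flag proposed above */
} ToyXlogState;

typedef enum { FLUSH_OK, FLUSH_DEFERRED, FLUSH_STOP } FlushResult;

/* Handle a flush request up to 'request'. */
static FlushResult toy_flush(ToyXlogState *st, ToyLsn request)
{
    assert(st != NULL);
    if (request <= st->end_of_log)
        return FLUSH_OK;
    if (st->in_recovery) {
        st->bad_lsn_seen = true;         /* remember it, keep replaying */
        return FLUSH_DEFERRED;
    }
    return FLUSH_STOP;                   /* normal mode: models elog(STOP) */
}

/* Replay all records; returns true if the restart may complete. */
static bool toy_recover(ToyXlogState *st, const ToyLsn *page_lsns, size_t n)
{
    st->in_recovery = true;
    st->bad_lsn_seen = false;
    for (size_t i = 0; i < n; i++)
        (void) toy_flush(st, page_lsns[i]);
    st->in_recovery = false;
    /* Abort only after everything was applied, so no later update is lost. */
    return !st->bad_lsn_seen;
}
```

When toy_recover() returns false, the DBA still has all the replayed data on disk and can decide between pg_resetxlog plus dump and a restore from backup.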
Vadim
P.S. Further discussion will be on the hackers list, sorry.