PANIC: corrupted item lengths

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: PANIC: corrupted item lengths
Date: 2009-06-04 10:48:47
Message-ID: 1244112527.23910.189.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


I've seen a couple of "corrupted item lengths" issues lately.

What seems strange about the various errors generated in bufpage.c is
that they are marked as ERRORs, yet are executed within a critical
section causing the system to PANIC.
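
To illustrate the escalation described above, here is a minimal sketch of
how an error level can be promoted while a critical section is open. It is
not the server's code: the `demo_` names and the numeric level values are
assumptions made up for this example, modelled loosely on the
critical-section counter and the error reporting path.

```c
#include <assert.h>

/* Illustrative stand-ins for the server's error levels (values are assumed). */
enum { DEMO_WARNING = 19, DEMO_ERROR = 20, DEMO_PANIC = 22 };

/* Hypothetical nesting counter for open critical sections. */
static int demo_crit_section_count = 0;

static void demo_start_crit_section(void)
{
    demo_crit_section_count++;
}

static void demo_end_crit_section(void)
{
    assert(demo_crit_section_count > 0);
    demo_crit_section_count--;
}

/*
 * Return the level that would actually be reported. Anything at ERROR or
 * above raised inside a critical section is promoted to PANIC, because the
 * code cannot safely unwind with shared state half-modified.
 */
static int demo_effective_level(int elevel)
{
    if (elevel >= DEMO_ERROR && demo_crit_section_count > 0)
        return DEMO_PANIC;
    return elevel;
}
```

In this model, the same sanity check reports ERROR when it runs before
demo_start_crit_section() and PANIC when it runs after it, which is why the
placement of the check matters more than the level it is marked with.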

There are a number of sanity checks that are made prior to any changes
taking place, so they need not generate PANICs. "corrupted item lengths"
is one such error message but there are others. There is no need to take
down the whole server just because one block has a corruption. I'm not
advocating that corruptions are acceptable, but the server provides no
way to remove the corruption, which is a problem, and the server really
should keep a better grip on its towel in any case.

I would much prefer:

* VACUUMs seeing these errors would perform as with zero_damaged_pages.
* other backends seeing those errors should just ERROR out.

We can do this by having a new function: bool PageIsValid() which
performs the sanity checks. This can then be executed by
heap_page_prune() prior to entering the critical section. That will then
be called correctly by both VACUUM and other code. VACUUM can then
optionally zero out the block, as is done with PageHeaderIsValid().
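
A rough sketch of the idea, under stated assumptions: the page layout below
is a simplified stand-in, not the real page header, and every `Demo`/`demo_`
name is hypothetical. The checks are loosely modelled on the kind of item
pointer and item length tests discussed here, run before any critical
section would begin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_ITEMS 16

/* Simplified line pointer: where the item starts and how long it is. */
typedef struct DemoItemId
{
    uint16_t off;
    uint16_t len;
    bool     used;
} DemoItemId;

/* Simplified page: only the fields the sanity checks need. */
typedef struct DemoPage
{
    uint16_t   lower;    /* end of the line pointer array */
    uint16_t   upper;    /* start of item data */
    uint16_t   special;  /* start of special space */
    uint16_t   nitems;
    DemoItemId items[DEMO_MAX_ITEMS];
} DemoPage;

typedef enum { DEMO_PRUNE_OK, DEMO_PRUNE_ERROR, DEMO_PRUNE_ZEROED } DemoPruneResult;

/* Read-only sanity checks; changes nothing, so it is safe to fail here. */
static bool demo_page_is_valid(const DemoPage *page)
{
    uint32_t total = 0;

    assert(page != NULL);
    if (page->lower > page->upper || page->upper > page->special)
        return false;
    if (page->nitems > DEMO_MAX_ITEMS)
        return false;

    for (size_t i = 0; i < page->nitems; i++)
    {
        const DemoItemId *id = &page->items[i];

        if (!id->used)
            continue;
        /* corrupted item pointer: data must lie in [upper, special) */
        if (id->off < page->upper || id->off >= page->special)
            return false;
        if ((uint32_t) id->off + id->len > page->special)
            return false;
        total += id->len;
    }

    /* corrupted item lengths: items cannot total more than the space */
    return total <= (uint32_t) (page->special - page->lower);
}

/*
 * Hypothetical caller: validate first, and only then enter the critical
 * section. VACUUM may zero the page; other callers just report an error.
 */
static DemoPruneResult demo_prune(DemoPage *page, bool is_vacuum, bool zero_damaged)
{
    if (!demo_page_is_valid(page))
    {
        if (is_vacuum && zero_damaged)
        {
            memset(page, 0, sizeof(*page));
            return DEMO_PRUNE_ZEROED;
        }
        return DEMO_PRUNE_ERROR;   /* plain ERROR, no critical section open */
    }
    /* the critical section and the actual page changes would start here */
    return DEMO_PRUNE_OK;
}
```

With this shape, a page whose item lengths add up to more than the
available space yields DEMO_PRUNE_ERROR for an ordinary backend and
DEMO_PRUNE_ZEROED for a VACUUM run with zeroing enabled, and neither path
takes the server down.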

Votes?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
