Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",
Date: 2005-10-28 17:26:03
Message-ID: 20051028172602.GH13187@pervasive.com
Lists: pgsql-hackers

On Fri, Oct 28, 2005 at 02:26:31PM +1000, Gavin Sherry wrote:
> Have spoken with Jim on IRC, he says that there have been several crashes
> recently due to a faulty disk array. I guess the zeroing could be an
> outcome of the faulty disk. I wonder if the crash that the faulty disk
> caused could have happened somewhere around mdextend(), after we create a
> zeroed page but before we could write out the initialised page.

Just to clarify, there's no evidence that the array is faulty. I do know,
though, that they were using write-back caching with a non-battery-backed
cache.

What has been happening is periodic random crashes, about once a week. I
now have a good core file for one of them, as well as an assertion failure:

TRAP: FailedAssertion("!(shared->page_number[slotno] == pageno &&
shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS)", File:
"slru.c", Line: 308)

I haven't looked at that code yet, so I have no idea what that actually
means. Let me know what info y'all would like to see out of the core.
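
Without having read slru.c, the condition can at least be read literally:
at the point of the check, slot slotno is expected to still hold page pageno
and still be marked SLRU_PAGE_READ_IN_PROGRESS. Below is a minimal sketch of
that invariant. It is not the real slru.c code: only page_number, page_status,
slotno, pageno and SLRU_PAGE_READ_IN_PROGRESS come from the TRAP text; the
struct, the other status values and the demo_* helpers are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NUM_SLOTS 8            /* hypothetical slot count */

typedef enum {
    DEMO_PAGE_EMPTY,                /* hypothetical: slot unused */
    SLRU_PAGE_READ_IN_PROGRESS,     /* name taken from the TRAP message */
    DEMO_PAGE_READY                 /* hypothetical: read has completed */
} DemoPageStatus;

/* Hypothetical stand-in for the shared control structure. */
typedef struct {
    int            page_number[DEMO_NUM_SLOTS];
    DemoPageStatus page_status[DEMO_NUM_SLOTS];
} DemoSlruShared;

/* The asserted condition, written as a predicate. */
static bool slot_still_reading(const DemoSlruShared *shared,
                               int slotno, int pageno)
{
    return shared->page_number[slotno] == pageno &&
           shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS;
}

/* Claim a slot for a page and mark the read as started. */
static void demo_begin_read(DemoSlruShared *shared, int slotno, int pageno)
{
    shared->page_number[slotno] = pageno;
    shared->page_status[slotno] = SLRU_PAGE_READ_IN_PROGRESS;
}

/*
 * Complete a read. Returns false in the situation where an assert-enabled
 * build would TRAP: the slot was changed by something else while the read
 * was supposedly still in progress.
 */
static bool demo_finish_read(DemoSlruShared *shared, int slotno, int pageno)
{
    if (!slot_still_reading(shared, slotno, pageno))
        return false;
    shared->page_status[slotno] = DEMO_PAGE_READY;
    return true;
}
```

In this toy model the check fails if either the page number or the status of
the slot changes between demo_begin_read() and demo_finish_read(). Whether
that is what happened here is something the core would have to show.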
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
