Re: 7.0.2 crash (maybe linux kernel bug??)

From: Alfred Perlstein <bright(at)wintelcom(dot)net>
To: Michael J Schout <mschout(at)gkg(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 7.0.2 crash (maybe linux kernel bug??)
Date: 2000-10-31 19:59:37
Message-ID: 20001031115937.Z22110@fw.wintelcom.net
Lists: pgsql-hackers
* Michael J Schout <mschout(at)gkg(dot)net> [001031 11:22] wrote:
> Hi.
> 
> I've had a crash in postgresql 7.0.2.  Looking at what happened, I actually
> suspect that this is a filesystem bug, and not a postgresql bug necessarily,
> but I wanted to report it here and see if anyone else had any opinions.
> 
> The platform this happened on was linux (redhat 6.2), kernel 2.2.16 (SMP) dual
> pentium III 500MHz cpus, Mylex DAC960 raid controller running in raid5 mode.
> 
> During regular activity, I got a kernel oops.  Looking at the call trace from
> the kernel, as well as the EIP, I think there may be a bug in the fs
> buffer code, and that this is a linux kernel problem (not a postgresql
> problem).
> 
> But I'm no expert here.  Does this sound correct, looking at the kernel errors
> below?
> 
> Sorry if this is off topic.  I just want to make sure this is a kernel bug and
> not a postgresql bug.
> 
> Mike
> 
> The oopses:
> 
> kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000134 
> kernel: current->tss.cr3 = 1a325000, %%cr3 = 1a325000 
> kernel: *pde = 00000000 
> kernel: Oops: 0002 
> kernel: CPU:    0 
> kernel: EIP:    0010:[remove_from_queues+169/328] 
> kernel: EFLAGS: 00010206 
> kernel: eax: 00000100   ebx: 00000002   ecx: df022e40   edx: efba76b8 
> kernel: esi: df022e40   edi: 00000000   ebp: 00000000   esp: da327ea4 
> kernel: ds: 0018   es: 0018   ss: 0018 
> kernel: Process postmaster (pid: 11527, process nr: 51, stackpage=da327000) 
> kernel: Stack: df022e40 c012be79 df022e40 df022e40 00001000 c0142cb8 c0142cc7 df022e40  
> kernel:        ec247140 ffffffea ec0b026c da326000 df022e40 df022e40 df022e40 000a4000  
> kernel:        00000000 da327f08 00000000 00000000 eff29200 00001000 000000a5 000a5000  
> kernel: Call Trace: [refile_buffer+77/184] [ext2_file_write+996/1584] [ext2_file_write+1011/1584] [kfree_skbmem+51/64] [__kfree_skb+162/168] [lockd:__insmod_lockd_O/lib/modules/2.2.16-3smp/fs/lockd.o_M394EA7+-76392/76] [handle_IRQ_event+90/140]  
> kernel:        [sys_write+240/292] [ext2_file_write+0/1584] [system_call+52/56] [startup_32+43/164]  
> kernel: Code: 89 50 34 c7 01 00 00 00 00 89 02 c7 41 34 00 00 00 00 ff 0d  
> kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000100 

Yes, your kernel basically segfaulted.  I would get a traceback from your
crash dump and discuss it with the kernel developers.
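
As a rough cross-check of the "segfault" reading, the fault address in the
first oops can be reproduced from the register dump and the Code: bytes.
The sketch below is only an illustration, not a kernel tool: parse_oops and
store_target are hypothetical helper names, the decoder handles just the one
instruction form seen here (opcode 89 with an 8-bit displacement), and it
assumes the Code: line starts at the faulting instruction.

```python
import re

# Lines copied from the first oops quoted above ("> kernel: " prefix dropped).
OOPS = """\
Unable to handle kernel NULL pointer dereference at virtual address 00000134
EIP:    0010:[remove_from_queues+169/328]
eax: 00000100   ebx: 00000002   ecx: df022e40   edx: efba76b8
Code: 89 50 34 c7 01 00 00 00 00 89 02 c7 41 34 00 00 00 00 ff 0d
"""

# x86 32-bit register numbering used in the ModRM byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def parse_oops(text):
    """Return (fault_address, registers, code_bytes) from oops text."""
    fault = int(re.search(r"virtual address ([0-9a-f]{8})", text).group(1), 16)
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(e[a-z]{2}): ([0-9a-f]{8})", text)
    }
    code_text = re.search(r"Code: ((?:[0-9a-f]{2} ?)+)", text).group(1)
    code = [int(byte, 16) for byte in code_text.split()]
    return fault, regs, code


def store_target(code, regs):
    """Effective address of a leading 'mov %reg, disp8(%base)' (opcode 0x89).

    Assumption: code[0] is the first byte of the faulting instruction.
    Only mod=01 (8-bit displacement) without a SIB byte is handled.
    """
    if code[0] != 0x89:
        raise ValueError("only opcode 0x89 is handled in this sketch")
    modrm = code[1]
    mod, rm = modrm >> 6, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only disp8 addressing without SIB is handled")
    disp = code[2] - 256 if code[2] >= 128 else code[2]  # sign-extend disp8
    return (regs[REG32[rm]] + disp) & 0xFFFFFFFF


fault, regs, code = parse_oops(OOPS)
print(hex(store_target(code, regs)), hex(fault))
```

Under those assumptions the bytes 89 50 34 decode as a store of edx to
0x34(%eax), and with eax = 00000100 that gives the reported address 00000134.
So the pointer being followed was 0x100 rather than exactly NULL, which is
worth mentioning when the trace goes to the kernel developers.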

--
-Alfred Perlstein - [bright(at)wintelcom(dot)net|alfred(at)freebsd(dot)org]
"I have the heart of a child; I keep it in a jar on my desk."


> kernel: current->tss.cr3 = 1ba46000, %%cr3 = 1ba46000 
> kernel: *pde = 00000000 
> kernel: Oops: 0000 
> kernel: CPU:    1 
> kernel: EIP:    0010:[find_buffer+104/144] 
> kernel: EFLAGS: 00010206 
> kernel: eax: 00000100   ebx: 00000007   ecx: 00069dae   edx: 00000100 
> kernel: esi: 0000000d   edi: 00003006   ebp: 0005ce4b   esp: e53a19f4 
> kernel: ds: 0018   es: 0018   ss: 0018 
> kernel: Process postmaster (pid: 5545, process nr: 37, stackpage=e53a1000) 
> kernel: Stack: 0005ce4b 00003006 00069dae c012b953 00003006 0005ce4b 00001000 c012bcc6  
> kernel:        00003006 0005ce4b 00001000 00003006 eff29200 00003006 00004e4b ef18c960  
> kernel:        c0141ee7 00003006 0005ce4b 00001000 0005ce4b e53a1bb0 edc3c660 edc3c660  
> kernel: Call Trace: [get_hash_table+23/36] [getblk+30/324] [ext2_new_block+2291/2756] [getblk+271/324] [ext2_alloc_block+344/356] [block_getblk+305/624] [ext2_getblk+256/524]  
> kernel:        [ext2_file_write+1308/1584] [__brelse+19/84] [permission+36/248] [dump_seek+53/104] [dump_seek+53/104] [dump_write+48/84] [elf_core_dump+3104/3216] [do_IRQ+82/92]  
> kernel:        [tcp_write_xmit+407/472] [__release_sock+36/124] [tcp_do_sendmsg+2125/2144] [inet_sendmsg+0/144] [cprt+1553/20096] [cprt+1553/20096] [cprt+1553/20096] [do_signal+458/724]  
> kernel:        [force_sig_info+168/180] [force_sig+17/24] [do_general_protection+54/160] [error_code+45/52] [signal_return+20/24]  
> kernel: Code: 8b 00 39 6a 04 75 15 8b 4c 24 20 39 4a 08 75 0c 66 39 7a 0c  
