Re: 7.0.2 crash (maybe linux kernel bug??)

From: Alfred Perlstein <bright(at)wintelcom(dot)net>
To: Michael J Schout <mschout(at)gkg(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 7.0.2 crash (maybe linux kernel bug??)
Date: 2000-10-31 19:59:37
Message-ID: 20001031115937.Z22110@fw.wintelcom.net
Lists: pgsql-hackers

* Michael J Schout <mschout(at)gkg(dot)net> [001031 11:22] wrote:
> Hi.
>
> I've had a crash in postgresql 7.0.2. Looking at what happened, I actually
> suspect that this is a filesystem bug, and not necessarily a postgresql bug,
> but I wanted to report it here and see if anyone else had any opinions.
>
> The platform this happened on was linux (redhat 6.2), kernel 2.2.16 (SMP) dual
> pentium III 500MHz cpus, Mylex DAC960 raid controller running in raid5 mode.
>
> During regular activity, I got a kernel oops. Looking at the call trace from
> the kernel, as well as the EIP, I think maybe there is a bug here in the fs
> buffer code, and that this is a linux kernel problem (not a postgresql
> problem).
>
> But I'm no expert here. Does this sound correct, looking at the kernel errors
> below?
>
> Sorry if this is off topic. I just want to make sure this is a kernel bug and
> not a postgresql bug.
>
> Mike
>
> The oopses:
>
> kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000134
> kernel: current->tss.cr3 = 1a325000, %%cr3 = 1a325000
> kernel: *pde = 00000000
> kernel: Oops: 0002
> kernel: CPU: 0
> kernel: EIP: 0010:[remove_from_queues+169/328]
> kernel: EFLAGS: 00010206
> kernel: eax: 00000100 ebx: 00000002 ecx: df022e40 edx: efba76b8
> kernel: esi: df022e40 edi: 00000000 ebp: 00000000 esp: da327ea4
> kernel: ds: 0018 es: 0018 ss: 0018
> kernel: Process postmaster (pid: 11527, process nr: 51, stackpage=da327000)
> kernel: Stack: df022e40 c012be79 df022e40 df022e40 00001000 c0142cb8 c0142cc7 df022e40
> kernel: ec247140 ffffffea ec0b026c da326000 df022e40 df022e40 df022e40 000a4000
> kernel: 00000000 da327f08 00000000 00000000 eff29200 00001000 000000a5 000a5000
> kernel: Call Trace: [refile_buffer+77/184] [ext2_file_write+996/1584] [ext2_file_write+1011/1584] [kfree_skbmem+51/64] [__kfree_skb+162/168] [lockd:__insmod_lockd_O/lib/modules/2.2.16-3smp/fs/lockd.o_M394EA7+-76392/76] [handle_IRQ_event+90/140]
> kernel: [sys_write+240/292] [ext2_file_write+0/1584] [system_call+52/56] [startup_32+43/164]
> kernel: Code: 89 50 34 c7 01 00 00 00 00 89 02 c7 41 34 00 00 00 00 ff 0d
> kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000100

Yes, your kernel basically segfaulted. I would get a traceback from your
crash dump and discuss it with the kernel developers.
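
As a rough illustration of why the quoted oopses read like a kernel-side
segfault, here is a minimal sketch that decodes the "Oops:" value and the
fault address. It assumes the standard i386 page-fault error-code bits
(bit 0 = page present, bit 1 = write, bit 2 = user mode) and that the
"Code:" bytes start at the faulting instruction. The function names are
hypothetical, picked only for this example.

```python
# Sketch only: helper names are hypothetical, not part of any kernel tool.
# Assumes the i386 page-fault error-code layout:
#   bit 0 set -> page was present (protection fault), clear -> not present
#   bit 1 set -> write access, clear -> read access
#   bit 2 set -> fault happened in user mode, clear -> kernel mode

def decode_oops_code(code_hex):
    """Decode the hex value printed after 'Oops:' into its three low bits."""
    code = int(code_hex, 16)
    return {
        "page_present": bool(code & 0x1),
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }

def faulting_address(base_hex, displacement=0):
    """Effective address of a base-register-plus-displacement operand."""
    return int(base_hex, 16) + displacement

# First oops: "Oops: 0002", eax = 00000100.
# Code bytes "89 50 34" encode mov %edx,0x34(%eax).
print(decode_oops_code("0002"))
print(hex(faulting_address("00000100", 0x34)))

# Second oops: "Oops: 0000", eax = 00000100.
# Code bytes "8b 00" encode mov (%eax),%eax.
print(decode_oops_code("0000"))
print(hex(faulting_address("00000100")))
```

Under those assumptions both faults are kernel-mode accesses to a page that
is not mapped, and the computed addresses (0x134 and 0x100) match the
"virtual address" lines in the report. In both cases eax holds 0x100 where
a valid buffer pointer would be expected.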

--
-Alfred Perlstein - [bright(at)wintelcom(dot)net|alfred(at)freebsd(dot)org]
"I have the heart of a child; I keep it in a jar on my desk."

> kernel: current->tss.cr3 = 1ba46000, %%cr3 = 1ba46000
> kernel: *pde = 00000000
> kernel: Oops: 0000
> kernel: CPU: 1
> kernel: EIP: 0010:[find_buffer+104/144]
> kernel: EFLAGS: 00010206
> kernel: eax: 00000100 ebx: 00000007 ecx: 00069dae edx: 00000100
> kernel: esi: 0000000d edi: 00003006 ebp: 0005ce4b esp: e53a19f4
> kernel: ds: 0018 es: 0018 ss: 0018
> kernel: Process postmaster (pid: 5545, process nr: 37, stackpage=e53a1000)
> kernel: Stack: 0005ce4b 00003006 00069dae c012b953 00003006 0005ce4b 00001000 c012bcc6
> kernel: 00003006 0005ce4b 00001000 00003006 eff29200 00003006 00004e4b ef18c960
> kernel: c0141ee7 00003006 0005ce4b 00001000 0005ce4b e53a1bb0 edc3c660 edc3c660
> kernel: Call Trace: [get_hash_table+23/36] [getblk+30/324] [ext2_new_block+2291/2756] [getblk+271/324] [ext2_alloc_block+344/356] [block_getblk+305/624] [ext2_getblk+256/524]
> kernel: [ext2_file_write+1308/1584] [__brelse+19/84] [permission+36/248] [dump_seek+53/104] [dump_seek+53/104] [dump_write+48/84] [elf_core_dump+3104/3216] [do_IRQ+82/92]
> kernel: [tcp_write_xmit+407/472] [__release_sock+36/124] [tcp_do_sendmsg+2125/2144] [inet_sendmsg+0/144] [cprt+1553/20096] [cprt+1553/20096] [cprt+1553/20096] [do_signal+458/724]
> kernel: [force_sig_info+168/180] [force_sig+17/24] [do_general_protection+54/160] [error_code+45/52] [signal_return+20/24]
> kernel: Code: 8b 00 39 6a 04 75 15 8b 4c 24 20 39 4a 08 75 0c 66 39 7a 0c
