Re: [pgadmin-hackers] developer.pgadmin.org/nagios.pgadmin.org - Diskfailure

From: "Dave Page" <dpage(at)vale-housing(dot)co(dot)uk>
To: <blacknoz(at)club-internet(dot)fr>,<dpage(at)vale-housing(dot)co(dot)uk>
Cc: <jam(at)zoidtechnologies(dot)com>,<pgadmin-hackers(at)postgresql(dot)org>,<pgsql-www(at)postgresql(dot)org>
Subject: Re: [pgadmin-hackers] developer.pgadmin.org/nagios.pgadmin.org - Diskfailure
Date: 2006-05-13 19:29:19
Message-ID: 001501c676c3$891ddd89$6a01a8c0@valehousing.co.uk
Lists: pgadmin-hackers, pgsql-www
-----Original Message-----
From: "Raphaël Enrici"<blacknoz(at)club-internet(dot)fr>
Sent: 13/05/06 12:45:59
To: "Dave Page"<dpage(at)vale-housing(dot)co(dot)uk>
Cc: "Jeff MacDonald"<jam(at)zoidtechnologies(dot)com>, "pgadmin-hackers(at)postgresql(dot)org"<pgadmin-hackers(at)postgresql(dot)org>, "PostgreSQL WWW"<pgsql-www(at)postgresql(dot)org>
Subject: Re: [pgadmin-hackers] [pgsql-www] developer.pgadmin.org/nagios.pgadmin.org - Diskfailure

Hi Raph,

>I recently (2 months ago) experienced kernel crash with reiserfs after
> some electrical failure. I solved the problem by doing a full fsck (I
> mean fsck and then a reiserfs rebuild of the tree [dangerous]). It
> worked, at least for me.

Thanks - I'm leaning towards the memory issue atm, as it seems to be OK again following a reboot, and the svn repo, which previously wouldn't tar or rsync, now verifies perfectly and can be tarred up.

I'll swap the sticks on Monday, and if that doesn't work, then consider a 'full fsck'. If that fails, I guess I'll just move it into the new chassis, and use scp to back up to another box until the new SCSI cable arrives.

Cheers, Dave

-----Unmodified Original Message-----
Dave Page wrote:
>  
> 
> 
>>-----Original Message-----
>>From: Jeff MacDonald [mailto:jam(at)zoidtechnologies(dot)com] 
>>Sent: 12 May 2006 23:19
>>To: Dave Page
>>Cc: Jeff MacDonald
>>Subject: Re: [pgsql-www] developer.pgadmin.org/nagios.pgadmin.org - Diskfailure
>>
>>On Fri, 2006-05-12 at 22:47 +0100, Dave Page wrote:
>>
>>>The machine hosting the developer.pgadmin.org and nagios.pgadmin.org
>>>vservers is currently having serious filesystem problems, which are
>>>causing disk intensive operations (like rsync, tar) to segfault for
>>>currently unknown reasons.
>>
>>do a memory test, swap as needed, see if that solves the 
>>problem.. 
> 
> 
> I'll try just replacing it - I have some unopened sticks for that mobo.
> FWIW, a reboot with a forced fsck found no errors at all and the box is
> currently working OK, but I have now found errors similar to the
> following:
> 
> May 12 21:11:29 barbas rsyncd[32134]: rsync: writefd_unbuffered failed
> to write 4 bytes: phase "send_file_entry" [sender]: Broken pipe (32)
> May 12 21:11:29 barbas rsyncd[32134]: rsync error: error in rsync
> protocol data stream (code 12) at io.c(1126) [sender]
> May 12 22:13:52 barbas kernel:  kernel BUG at page_alloc.c:142!
> May 12 22:13:52 barbas kernel: invalid operand: 0000
> May 12 22:13:52 barbas kernel: CPU:    1
> May 12 22:13:52 barbas kernel: EIP:    0010:[<c013cec0>]    Not tainted
> May 12 22:13:52 barbas kernel: EFLAGS: 00010286
> May 12 22:13:52 barbas kernel: eax: d9e18100   ebx: c262c140   ecx:
> c262c140   edx: 00000000
> May 12 22:13:52 barbas kernel: esi: c262c140   edi: 00000000   ebp:
> 00000000   esp: d50d5edc
> May 12 22:13:52 barbas kernel: ds: 0018   es: 0018   ss: 0018
> May 12 22:13:52 barbas kernel: Process rsync (pid: 32141,
> stackpage=d50d5000)
> May 12 22:13:52 barbas kernel: Stack: d50d5ee8 c0133ab0 00001000
> c262c140 e3a59d44 00006000 c01348e9 00000000
> May 12 22:13:52 barbas kernel:        00000000 00001000 c262c140
> e3a59d44 00000000 c013423d d50d5f7c c262c140
> May 12 22:13:52 barbas kernel:        00000000 00001000 00001000
> 00000001 00000000 0000013b e3a59c80 c01347f0
> May 12 22:13:52 barbas kernel: Call Trace:    [<c0133ab0>] [<c01348e9>]
> [<c013423d>] [<c01347f0>] [<c01347f0>]
> May 12 22:13:52 barbas kernel:   [<c0134a2f>] [<c01347f0>] [<c0145a50>]
> [<c0108fdf>]
> May 12 22:13:52 barbas kernel:
> May 12 22:13:52 barbas kernel: Code: 0f 0b 8e 00 6b ba 37 c0 e9 ba fd ff
> ff 8b 69 60 85 ed 0f 85


Dave,

I recently (2 months ago) experienced kernel crash with reiserfs after
some electrical failure. I solved the problem by doing a full fsck (I
mean fsck and then a reiserfs rebuild of the tree [dangerous]). It
worked, at least for me.

Regards,
Raphaël
