Re: developer.pgadmin.org/nagios.pgadmin.org - Disk failure

From: "Dave Page" <dpage(at)vale-housing(dot)co(dot)uk>
To: "Jeff MacDonald" <jam(at)zoidtechnologies(dot)com>
Cc: <pgadmin-hackers(at)postgresql(dot)org>,"PostgreSQL WWW" <pgsql-www(at)postgresql(dot)org>
Subject: Re: developer.pgadmin.org/nagios.pgadmin.org - Disk failure
Date: 2006-05-12 22:37:21
Message-ID: E7F85A1B5FF8D44C8A1AF6885BC9A0E401388168@ratbert.vale-housing.co.uk
Lists: pgadmin-hackers, pgsql-www
 

> -----Original Message-----
> From: Jeff MacDonald [mailto:jam(at)zoidtechnologies(dot)com] 
> Sent: 12 May 2006 23:19
> To: Dave Page
> Cc: Jeff MacDonald
> Subject: Re: [pgsql-www] 
> developer.pgadmin.org/nagios.pgadmin.org - Diskfailure
> 
> On Fri, 2006-05-12 at 22:47 +0100, Dave Page wrote:
> > The machine hosting the developer.pgadmin.org and 
> nagios.pgadmin.org 
> > vservers is currently having serious filesystem problems, which are 
> > causing disk intensive operations (like rsync, tar) to segfault for 
> > currently unknown reasons.
> 
> do a memory test, swap as needed, see if that solves the 
> problem.. 

I'll try just replacing it - I have some unopened sticks for that mobo.
FWIW, a reboot with a forced fsck found no errors at all and the box is
currently working OK, but I have now found errors similar to the
following:

May 12 21:11:29 barbas rsyncd[32134]: rsync: writefd_unbuffered failed to write 4 bytes: phase "send_file_entry" [sender]: Broken pipe (32)
May 12 21:11:29 barbas rsyncd[32134]: rsync error: error in rsync protocol data stream (code 12) at io.c(1126) [sender]
May 12 22:13:52 barbas kernel:  kernel BUG at page_alloc.c:142!
May 12 22:13:52 barbas kernel: invalid operand: 0000
May 12 22:13:52 barbas kernel: CPU:    1
May 12 22:13:52 barbas kernel: EIP:    0010:[<c013cec0>]    Not tainted
May 12 22:13:52 barbas kernel: EFLAGS: 00010286
May 12 22:13:52 barbas kernel: eax: d9e18100   ebx: c262c140   ecx: c262c140   edx: 00000000
May 12 22:13:52 barbas kernel: esi: c262c140   edi: 00000000   ebp: 00000000   esp: d50d5edc
May 12 22:13:52 barbas kernel: ds: 0018   es: 0018   ss: 0018
May 12 22:13:52 barbas kernel: Process rsync (pid: 32141, stackpage=d50d5000)
May 12 22:13:52 barbas kernel: Stack: d50d5ee8 c0133ab0 00001000 c262c140 e3a59d44 00006000 c01348e9 00000000
May 12 22:13:52 barbas kernel:        00000000 00001000 c262c140 e3a59d44 00000000 c013423d d50d5f7c c262c140
May 12 22:13:52 barbas kernel:        00000000 00001000 00001000 00000001 00000000 0000013b e3a59c80 c01347f0
May 12 22:13:52 barbas kernel: Call Trace:    [<c0133ab0>] [<c01348e9>] [<c013423d>] [<c01347f0>] [<c01347f0>]
May 12 22:13:52 barbas kernel:   [<c0134a2f>] [<c01347f0>] [<c0145a50>] [<c0108fdf>]
May 12 22:13:52 barbas kernel:
May 12 22:13:52 barbas kernel: Code: 0f 0b 8e 00 6b ba 37 c0 e9 ba fd ff ff 8b 69 60 85 ed 0f 85

Could well be a duff stick I guess, given where it died.
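
For anyone wanting to check their own logs for the same signature, here is a rough sketch in Python that picks out "kernel BUG at" lines in the syslog format shown above. The function name and the regular expression are my own assumptions for illustration, not part of any existing tool, and the pattern only covers the exact line layout seen in this excerpt.

```python
import re

# Assumed pattern: matches syslog lines shaped like the one above, e.g.
# "May 12 22:13:52 barbas kernel:  kernel BUG at page_alloc.c:142!"
BUG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"kernel BUG at (?P<file>[^:\s]+):(?P<line>\d+)!"
)


def find_kernel_bugs(lines):
    """Return (timestamp, host, source file, line number) per kernel BUG line.

    Hypothetical helper; lines that do not match the pattern are skipped.
    """
    hits = []
    for text in lines:
        m = BUG_RE.match(text)
        if m:
            hits.append((m["stamp"], m["host"], m["file"], int(m["line"])))
    return hits
```

Feeding it the lines of a log file (wherever syslog writes kernel messages on the box in question) gives one tuple per BUG report, which makes it easier to see whether the failures cluster around one process or time.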

> the quicker solution may be to just put the backup 
> machine into production rather than running exhaustive memory tests.

Yes, well it was going into it anyway to get it out of the current 3U
chassis and into a 1U one with full OOB management. The only problem is
that I'm still awaiting delivery of a cable for the external tape drive
in the rack so I can only do rsync/scp backups until that arrives.

Regards, Dave.
