Re: developer.pgadmin.org/nagios.pgadmin.org - Disk failure

From: "Dave Page" <dpage(at)vale-housing(dot)co(dot)uk>
To: <travis(dot)hein(at)travnet(dot)org>, <pgsql-www(at)postgresql(dot)org>
Subject: Re: developer.pgadmin.org/nagios.pgadmin.org - Disk failure
Date: 2006-05-13 19:29:21
Message-ID: 001601c676c3$8963022f$6a01a8c0@valehousing.co.uk
Lists: pgsql-www


-----Original Message-----
From: "Travis Hein"<travis(dot)hein(at)travnet(dot)org>
Sent: 13/05/06 03:11:53
To: "pgsql-www(at)postgresql(dot)org"<pgsql-www(at)postgresql(dot)org>
Subject: Re: [pgsql-www] developer.pgadmin.org/nagios.pgadmin.org - Disk failure

> Well, sorry to hear things are funny there, and it is probably not related to
> my colorful ranting.

:-)

This box has been fine for 18 months or so, so I doubt it's the same issue as yours - still, it's always interesting to hear of others' experiences.

> But I have some space you can borrow to stuff things on, if it is helpful.

Thanks - space isn't a problem though, so I should be OK.

Cheers, Dave.

-----Unmodified Original Message-----
I used to have a pair of old SCSI drives in software RAID1. I never used
reiserfs; it was the ext2 of the day. It worked great most of the time,
except when I did backups, when there was more bus or bulk activity. At
first I went nuts thinking the SCSI tape drive was badly terminated or
wreaking havoc on the bus, but then I found the same problem happened with
network backups and backups to an IDE drive.
The issue was that the kernel SCSI card driver was using tagged command
queueing (TCQ), but one of my drives didn't know what to do with those
commands, whilst the other drive did support TCQ. I am not a SCSI scientist,
but my best theory was that something evil eventually broke under the
high loads because of the mismatched TCQ support, and the software RAID1
didn't know what to do then.
I never did fix it; I moved to new hardware and abandoned the system
altogether.

Well, sorry to hear things are funny there, and it is probably not related to
my colorful ranting.

But I have some space you can borrow to stuff things on, if it is helpful.

$>df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   337G  196G  141G   59%   /
/dev/sdb1   231G  184G  48G    80%   /mnt/backup

The 141G free on / is a 6-element RAID 5 array on ext3, with rsync backups to
/mnt/backup, which is a USB drive, but it works :)

Let me know if there is anything I can do.

On Friday 12 May 2006 17:47, Dave Page wrote:
> The machine hosting the developer.pgadmin.org and nagios.pgadmin.org
> vservers is currently having serious filesystem problems, which are
> causing disk intensive operations (like rsync, tar) to segfault for
> currently unknown reasons. If you commit to the pgAdmin SVN, please hold
> off for a while, or if you are working on other projects on the machine,
> please don't for now!
>
> If anyone has any idea what might cause ReiserFS to die horribly like
> this, whilst the RAID1 disks don't so much as squeak in the wrong way,
> I'd love to hear it!!
>
> Anyhoo, I have backups, and a replacement machine sitting in the wings
> so I should be able to get things sorted early next week.
>
> Regards, Dave
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

--
Only those who attempt the absurd can achieve the impossible.

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
