Re: Write cache

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: <ohp(at)pyrenet(dot)fr>, "'pgsql-hackers list'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Write cache
Date: 2004-01-28 14:56:40
Message-ID: 003301c3e5ae$f04575a0$efb887d9@LaptopDellXP
Lists: pgsql-hackers

> Olivier PRENANT writes...
>
> Because I've lost a lot of data using postgresql (and I know for sure this
> shouldn't happen) I've gone a bit further reading documentations on my
> disks and...
>

The bottom line here is that Olivier has lost some data, and I'm sure we
all want to know whether there is a bug in PostgreSQL or whether he has a
hardware problem. So far, PostgreSQL is implicated only because it
discovered the error; nothing else yet associates it with the fatal
crash itself.

My intuition tells me that this is hardware related. We've discussed
some probable causes, but nobody has come up with a diagnostic test to
evaluate the disks' accuracy. This might be because this forum isn't the
most appropriate place to discuss disk storage or Linux device drivers.

Olivier: if your disks are supported or under warranty, then my advice
would be to contact the vendor and ask for details of a suitable
diagnostic test, or go via their support forums to research this.
Expensive disks are usually fairly well supported, especially if they
smell an upgrade. :)

My experience with other RDBMS vendors' support teams is that they give
out this advice regularly when faced with RDBMS-reported data corruption
errors: "check your disks are working"; I think it is reasonable to do
the same here. Data corruption by the DBMS does occur, but my experience
is that it is less frequent than hardware-related causes. In the past, I
have used the dd command to squirt data at the disk, then read it back
again. There may be reasons I don't know of why a success on that test
might not be conclusive, so I personally would be happy to defer to
someone who does know. I've seen errors like this come from soon-to-fail
disks, poor device drivers, failing non-volatile RAM, cabinet backplane
noise, poorly wired cabling and intermittently used shared SCSI...
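
For anyone who wants to try the write-then-read-back idea, here is a
minimal sketch in Python rather than dd. The function names, the seed
and the chunk size are all hypothetical choices made for illustration.
It writes a reproducible pattern, forces it out with fsync, reads the
file back and compares checksums. A pass is not conclusive: the read
may be served from the operating system's cache rather than from the
disk, and a drive with its own write cache can acknowledge data it has
not yet stored.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # bytes written per block (1 MiB, an arbitrary choice)


def make_block(seed, index):
    """Build one reproducible CHUNK-sized block from a seed and a block index."""
    unit = hashlib.sha256(seed + index.to_bytes(8, "big")).digest()  # 32 bytes
    return unit * (CHUNK // len(unit))


def write_pattern(path, total_chunks, seed=b"disk-check"):
    """Write total_chunks blocks to path and return the SHA-256 of the data sent."""
    digest = hashlib.sha256()
    with open(path, "wb") as f:
        for i in range(total_chunks):
            block = make_block(seed, i)
            f.write(block)
            digest.update(block)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device
    return digest.hexdigest()


def read_digest(path):
    """Read the file back in CHUNK-sized pieces and return its SHA-256."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, total_chunks):
    """Return True if the data read back matches the data written."""
    return write_pattern(path, total_chunks) == read_digest(path)
```

Calling verify() with a path on the suspect filesystem and a chunk count
well above the machine's RAM makes it less likely that the read-back is
answered from cache, though it still cannot rule out intermittent
faults of the kind listed above.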

Best of luck, Simon Riggs
