Plug-pull testing worked, diskchecker.pl failed

From: Chris Angelico <rosuav(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Plug-pull testing worked, diskchecker.pl failed
Date: 2012-10-22 13:17:13
Message-ID: CAPTjJmrDuEjvBo00XvLMLOQ5vdXfFo33K1Gmvzvj0KiiahnjEQ@mail.gmail.com
Lists: pgsql-general

After reading the comments last week about SSDs, I did some testing of
the ones we have at work. Each of my test boxes (three with SSDs, one
with an HDD) was subjected to multiple stand-alone plug-pull tests, using
pgbench to provide load. So far, there have been no instances of
PostgreSQL data corruption, but diskchecker.pl reported huge numbers
of errors.

What exactly does this mean? Is Postgres doing something that
diskchecker isn't, and is thus safe? Could data corruption occur but
I've just never pulled the power out at the precise microsecond when
it would cause problems? Or is it that we would lose entire
transactions, but never experience corruption that the postmaster
can't repair?

Interestingly, disabling write-caching with 'hdparm -W 0 /dev/sda' (as
per the livejournal blog[1]) reduced the SSDs' error rates without
eliminating failures entirely, while on the HDD, there were no
problems at all with write caching off.
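
For anyone unfamiliar with what this kind of test exercises, here is a
minimal sketch of the general write-then-fsync-then-verify idea. It is
not diskchecker.pl's actual code; the function names, block size, and
file layout are hypothetical, and in a real plug-pull test the record of
acknowledged writes would have to be kept on a different machine so that
it survives the power cut.

```python
import os
import zlib

BLOCK_SIZE = 4096  # hypothetical block size chosen for this sketch


def durable_write(path, block_no, payload):
    """Write one padded block, fsync it, and return its CRC32.

    Once os.fsync() returns, the OS claims the data is on stable
    storage. A drive with a volatile write cache may still lose it.
    """
    data = payload.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.pwrite(fd, data, block_no * BLOCK_SIZE)
        os.fsync(fd)
    finally:
        os.close(fd)
    return zlib.crc32(data)


def verify(path, acknowledged):
    """Return the block numbers whose on-disk CRC32 does not match
    the CRC32 recorded when the write was acknowledged."""
    bad = []
    with open(path, "rb") as f:
        for block_no, crc in sorted(acknowledged.items()):
            f.seek(block_no * BLOCK_SIZE)
            if zlib.crc32(f.read(BLOCK_SIZE)) != crc:
                bad.append(block_no)
    return bad
```

After power is restored, any block reported by verify() is one that was
acknowledged as synced and then turned out to be missing or wrong,
which is the kind of error a volatile write cache can produce.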

ChrisA
