
Re: hanging for 30sec when checkpointing

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Peter Galbavy <peter(dot)galbavy(at)knowtion(dot)net>
Cc: Iain <iain(at)mst(dot)co(dot)jp>, Shane Wright <me(at)shanewright(dot)co(dot)uk>,<pgsql-admin(at)postgresql(dot)org>
Subject: Re: hanging for 30sec when checkpointing
Date: 2004-02-10 16:08:28
Message-ID: Pine.LNX.4.33.0402100856040.28531-100000@css120.ihs.com
Lists: pgsql-admin

On Tue, 10 Feb 2004, Peter Galbavy wrote:

> scott.marlowe wrote:
> > I don't know who you think you are, but I've physically tested the
> > stuff I'm talking about.  Care to qualify what you mean?
> 
> I would genuinely be interested in seeing the results and the methodology.
> 
> > IDE drives (all the ones I've ever tested) LIE about their write
> > caches and fsync.  don't believe me?  Simple, hook one up, initiate
> > 100 parallel transactions, pull the power plug, watch your database
> > fail to come back up due to the corruption caused by the LYING IDE
> > drives.
> 
> See my comment/question below.
> 
> > Do the same with SCSI.  watch the database come right back to life.
> >
> > If you're gonna accuse me of lying, you damned well better have the
> > balls AND evidence to back it up.
> 
> I am NOT accusing anyone of lying, least of all people I don't personally
> know, and certainly not you. What I am referring to is over-generalisation.
> You made a long and detailed generalisation, without detailing anything.

Oh, spreading misinformation isn't lying?  You live in a different world 
than I do.

From www.dictionary.com:

misinformation

\Mis*in`for*ma"tion\, n. Untrue or incorrect information. --Bacon.

Source: Webster's Revised Unabridged Dictionary, © 1996, 1998 MICRA, Inc.

> My primary question, without seeing the way you did it, is can you comment
> on whether you wrote your own testbed or did you rely on potentially flawed
> OS interfaces ? Did you use a signal analyser ?

Last year, I and several others on the pgsql lists ran a series of tests 
to determine which drive subsystems could survive power off tests.  We 
ran the tests by initiating dozens or hundreds of simultaneous 
transactions against a postgresql machine, then pulling the plug in the 
middle.
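
For anyone who wants to try the same idea without a database, here is a
rough sketch of the load side.  It is NOT the harness we used (we ran
real transactions against postgresql); plain files and os.fsync stand in
for the database, and the names writer, run_load and verify are made up
for illustration.  In a real run the acknowledgements would have to be
recorded on a second machine, since the one under test loses power.

```python
import os
import threading


def writer(path, worker_id, count, acked):
    """Append `count` records, fsync after each, note each acknowledged one."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for seq in range(count):
            os.write(fd, f"{worker_id}:{seq}\n".encode())
            os.fsync(fd)  # the OS and drive now claim the record is durable
            # Assumption: in a real test this would be sent to another host.
            acked[worker_id] = seq
    finally:
        os.close(fd)


def run_load(directory, workers=100, count=50):
    """Start `workers` parallel writers; pull the plug while this runs."""
    acked = {}
    threads = []
    for w in range(workers):
        path = os.path.join(directory, f"worker{w}.log")
        t = threading.Thread(target=writer, args=(path, w, count, acked))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return acked


def verify(directory, acked):
    """After reboot: list workers whose last acknowledged record is missing."""
    missing = []
    for w, last_seq in acked.items():
        path = os.path.join(directory, f"worker{w}.log")
        try:
            with open(path, "rb") as f:
                lines = f.read().splitlines()
        except FileNotFoundError:
            lines = []
        if f"{w}:{last_seq}".encode() not in lines:
            missing.append(w)
    return sorted(missing)
```

If verify() returns anything after a power pull, something between the
application and the platter acknowledged an fsync it had not completed.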

Due to the nature of postgresql, if a drive reports an fsync as complete
before it has actually written out its cache, the database will be
corrupted and refuse to start up when the machine is powered back up.
Signal analyzers are nice, but if the database doesn't work, it doesn't
really matter what the signal analyzer says.  If you'd like to set one up
and test that way, be my guest, but the "rubber hitting the road" is when
you simulate the real thing: losing power during transactions.
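
A cheaper sanity check than pulling the plug is to time fsync itself.
This was not part of the tests described here; it is only a rule of
thumb, and it assumes a single bare spinning drive with no battery
backed controller in front of it.  A drive that really waits for the
platter can rewrite the same sector about once per revolution, so a
measured rate far above rpm/60 suggests a cache is acknowledging the
writes.  The function names are hypothetical.

```python
import os
import time


def fsync_rate(path, writes=200):
    """Rewrite the same 512 bytes `writes` times, fsyncing each; return ops/sec."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)  # always hit the same location
            os.write(fd, b"x" * 512)
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return writes / max(elapsed, 1e-9)


def max_honest_rate(rpm):
    """Rough ceiling: one rewrite of the same sector per platter revolution."""
    return rpm / 60.0
```

For example, compare fsync_rate("/path/on/test/drive/probe.dat") against
max_honest_rate(7200), which is 120 per second.  A passing number here
proves nothing by itself; the power-off test is still the real check.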

Here's what we found:

SCSI drives (at least all the ones we tested; I tested Seagate 18 gig
10krpm barracudas, and others tested many more), as a group, passed the
test with flying colors.  No one at that time found a single SCSI drive
that failed it.

IDE drives, with write cache enabled, failed 100% of the time.

IDE drives, with write cache disabled, passed 100% of the time.

SCSI RAID controllers with battery backed cache set to write back passed.

The IDE RAID controller from Escalade passed.  I don't recall if we ever 
found out if it had battery backed cache, or if it disabled the cache on 
the drives.

Performance-wise, the IDEs were neck and neck with the SCSI drives when
they had their write caches enabled.  When the write cache was disabled,
they ran at about 1/4 to 1/3 the speed of the SCSI drives.

The SCSI RAID cards with battery backed cache (the LSI MegaRAID is what I
tested; someone else tested the Adaptec), as well as the Escalade, were
great performers.

> Now, I have *not* done the tests - hence my real interest, but I have had at
> least as many problems with SCSI sub-systems as with IDE over the years.
> Probably more actually. Ever since using IBM EIDE drives (the 75GXP
> included, I am a lucky one) I have had very little, knock on wood, to worry
> about even during power failures.

I've built servers with both IDE and SCSI in the past 5 years, and my 
experience has been that while IDE is fine for file / print servers, it's 
a disaster waiting to happen under postgresql.

Keep in mind, we're not talking drive failure rates, or cabling / 
termination issues here.  We're talking about the fact that with IDE drives 
(without the Escalade controller) you have two choices: fast or safe.

With SCSI, you get both.  With the RAID controllers mentioned you have 
both.

While my post may have seemed like simple uninformed opinion to you at the 
time you read it, it was, in fact, backed up by weeks of research by both 
myself and a handful of other people on the postgresql mailing lists.  
Your extremely flippant remark could just as easily have been a request 
for more information on how I had reached those conclusions, but no.  It 
had to be an accusation of lying.  And that IS what it was.  No amount of 
hand waving by you can change the fact that you accused me of 
disseminating misinformation, which is disseminating untruths, which is 
lying.

