Re: hanging for 30sec when checkpointing

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Peter Galbavy <peter(dot)galbavy(at)knowtion(dot)net>
Cc: Iain <iain(at)mst(dot)co(dot)jp>, Shane Wright <me(at)shanewright(dot)co(dot)uk>, <pgsql-admin(at)postgresql(dot)org>
Subject: Re: hanging for 30sec when checkpointing
Date: 2004-02-10 16:08:28
Message-ID: Pine.LNX.4.33.0402100856040.28531-100000@css120.ihs.com
Lists: pgsql-admin

On Tue, 10 Feb 2004, Peter Galbavy wrote:

> scott.marlowe wrote:
> > I don't know who you think you are, but I've physically tested the
> > stuff I'm talking about. Care to qualify what you mean?
>
> I would genuinely be interested in seeing the results and the methodology.
>
> > IDE drives (all the ones I've ever tested) LIE about their write
> > caches and fsync. Don't believe me? Simple, hook one up, initiate
> > 100 parallel transactions, pull the power plug, watch your database
> > fail to come back up due to the corruption caused by the LYING IDE
> > drives.
>
> See my comment/question below.
>
> > Do the same with SCSI. Watch the database come right back to life.
> >
> > If you're gonna accuse me of lying, you damned well better have the
> > balls AND evidence to back it up.
>
> I am NOT accusing anyone of lying, least of all people I don't personally
> know, and certainly not you. What I am referring to is over-generalisation.
> You made a long and detailed generalisation, without detailing anything.

Oh, spreading misinformation isn't lying? You live in a different world
than I do.

From www.dictionary.com:

misinformation

\Mis*in`for*ma"tion\, n. Untrue or incorrect information. --Bacon.

Source: Webster's Revised Unabridged Dictionary, © 1996, 1998 MICRA, Inc.

> My primary question, without seeing the way you did it, is can you comment
> on whether you wrote your own testbed or did you rely on potentially flawed
> OS interfaces ? Did you use a signal analyser ?

Last year, I and several others on the pgsql lists ran a series of tests
to determine which drive subsystems could survive power off tests. We
ran the tests by initiating dozens or hundreds of simultaneous
transactions against a postgresql machine, then pulling the plug in the
middle.

Due to the nature of postgresql, if a drive reports an fsync before it has
actually written out its cache, the database will be corrupted and refuse
to start up when the machine is powered back up. Signal analyzers are
nice, but if the database doesn't work, it doesn't really matter what the
signal analyzer says. If you'd like to set one up and test that way, be my
guest, but the "rubber hitting the road" is when you simulate the real
thing: losing power during transactions.
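
As a rough illustration of the idea (not the harness used in these tests,
which ran real postgresql transactions), the check can be sketched in Python:
append numbered records, fsync after each one, and count a record as
acknowledged only once fsync has returned. After a power cut, any
acknowledged record missing from the file means something below the OS
reported the write as durable before it was. The file layout and the names
write_records and verify_records are hypothetical, and in a real test the
last acknowledged number would have to be recorded on a second machine.

```python
import os
import tempfile

RECORD_SIZE = 16  # fixed-width records: 15 digits plus a newline


def write_records(path, count):
    """Append `count` numbered records to a fresh file, fsyncing after each.

    Returns the highest sequence number whose fsync call returned, i.e. the
    last record the OS claimed had reached stable storage.
    """
    last_acked = -1
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for seq in range(count):
            os.write(fd, b"%015d\n" % seq)
            os.fsync(fd)       # ask the OS to push the data to the drive
            last_acked = seq   # only now is the record "acknowledged"
    finally:
        os.close(fd)
    return last_acked


def verify_records(path, last_acked):
    """Return True if every record up to `last_acked` is intact in the file."""
    with open(path, "rb") as f:
        data = f.read()
    for seq in range(last_acked + 1):
        start = seq * RECORD_SIZE
        if data[start:start + RECORD_SIZE] != b"%015d\n" % seq:
            return False
    return True


if __name__ == "__main__":
    # Demo without pulling any plugs: write, then check the file.
    demo_path = os.path.join(tempfile.mkdtemp(), "records.dat")
    acked = write_records(demo_path, 100)
    print("acknowledged up to", acked, "intact:", verify_records(demo_path, acked))
```

In an actual power-off run, write_records would be interrupted by the plug
being pulled and verify_records would be run after reboot against the last
number the other machine saw acknowledged.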

Here's what we found:

SCSI drives (at least all the ones we tested; I tested Seagate 18 gig
10krpm Barracudas, and many others were tested) passed the test as a group
with flying colors. No one at that time found a single SCSI drive
that failed it.

IDE drives, with write cache enabled, failed 100% of the time.

IDE drives, with write cache disabled, passed 100% of the time.

SCSI RAID controllers with battery backed cache set to write back passed.

The IDE RAID controller from Escalade passed. I don't recall if we ever
found out if it had battery backed cache, or if it disabled the cache on
the drives.

Performance-wise, the IDEs were neck and neck with the SCSI drives when
they had their write caches enabled. When the write cache was disabled,
the IDE drives ran at about 1/4 to 1/3 the speed of the SCSI drives.
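
For anyone who wants a quick, non-destructive hint before pulling plugs, a
small timing sketch like the one below measures how many write+fsync cycles
per second a disk will accept. This is not how the numbers above were
obtained. The comparison against one commit per platter rotation is a common
rule of thumb for a single spinning disk, added here as an assumption, and
the function names are hypothetical.

```python
import os
import tempfile
import time


def fsyncs_per_second(path, iterations=200):
    """Time `iterations` small write+fsync cycles and return the rate."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, b"x" * 512)
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return iterations / elapsed


def max_honest_rate(rpm):
    """Rule-of-thumb ceiling for serial commits: one per platter rotation."""
    return rpm / 60.0


if __name__ == "__main__":
    probe = os.path.join(tempfile.mkdtemp(), "probe.dat")
    print("measured fsyncs/sec:", round(fsyncs_per_second(probe)))
    print("7200 rpm ceiling:   ", max_honest_rate(7200))
```

A single drive that reports a rate far above rpm/60 is probably answering
fsync from its write cache, though only the power-off test settles it.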

The SCSI RAID card (LSI MegaRAID is what I tested, someone else tested
the Adaptec) with battery backed cache as well as the Escalade were
great performers.

> Now, I have *not* done the tests - hence my real interest, but I have had at
> least as many problems with SCSI sub-systems as with IDE over the years.
> Probably more actually. Ever since using IBM EIDE drives (the 75GXP
> included, I am a lucky one) I have had very little, knock on wood, to worry
> about even during power failures.

I've built servers with both IDE and SCSI in the past 5 years, and my
experience has been that while IDE is fine for file / print servers, it's
a disaster waiting to happen under postgresql.

Keep in mind, we're not talking about drive failure rates or cabling /
termination issues here. We're talking about the fact that with IDE drives
(without the Escalade controller) you have two choices: fast or safe.

With SCSI, you get both. With the RAID controllers mentioned, you get
both.

While my post may have seemed like simple uninformed opinion to you at the
time you read it, it was, in fact, backed up by weeks of research by both
myself and a handful of other people on the postgresql mailing lists.
Your extremely flippant remark could just as easily have been a request
for more information on how I had reached those conclusions, but no. It
had to be an accusation of lying. And that IS what it was. No amount of
hand waving by you can change the fact that you accused me of
disseminating misinformation, which is disseminating untruths, which is
lying.
