Re: How to improve db performance with $7K?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Brown <kevin(at)sysexperts(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: How to improve db performance with $7K?
Date: 2005-04-15 02:41:56
Message-ID: 28523.1113532916@sss.pgh.pa.us
Lists: pgsql-performance

Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> Tom Lane wrote:
>> The reason this is so much more of a win than it was when ATA was
>> designed is that in modern drives the kernel has very little clue about
>> the physical geometry of the disk. Variable-size tracks, bad-block
>> sparing, and stuff like that make for a very hard-to-predict mapping
>> from linear sector addresses to actual disk locations.

> What I mean is that when it comes to scheduling disk activity,
> knowledge of the specific physical geometry of the disk isn't really
> important.

Oh?

Yes, you can probably assume that blocks with far-apart numbers are
going to require a big seek, and you might even be right in supposing
that a block with an intermediate number should be read on the way.
But you have no hope at all of making the right decisions at a more
local level --- say, reading various sectors within the same cylinder
in an optimal fashion. You don't know where the track boundaries are,
so you can't schedule in a way that minimizes rotational latency.
You're best off throwing all the requests at the drive together and
letting the drive sort it out.

This is not to say that there's not a place for a kernel-side scheduler
too. The drive will probably have a fairly limited number of slots in
its command queue. The optimal thing is for those slots to be filled
with requests that are in the same area of the disk. So you can still
get some mileage out of an elevator algorithm that works on logical
block numbers to give the drive requests for nearby block numbers at the
same time. But there's also a lot of use in letting the drive do its
own low-level scheduling.
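
As a rough illustration of the kernel-side half of that, here is a minimal sketch of a one-directional elevator that works purely on logical block numbers and hands the drive one queue-full of nearby requests at a time. The function name, the four-slot queue, and the sample block numbers are all made up for the example; a real kernel scheduler also deals with merging, fairness, and deadlines, none of which appear here.

```python
from typing import List


def next_batch(pending: List[int], head_lba: int, queue_slots: int) -> List[int]:
    """Pick up to queue_slots pending requests for one refill of the drive queue.

    One-directional elevator on logical block numbers: take the nearest
    requests at or above head_lba in ascending order, and wrap around to the
    lowest-numbered requests only if fewer than queue_slots lie ahead.
    """
    ordered = sorted(pending)
    ahead = [lba for lba in ordered if lba >= head_lba]
    behind = [lba for lba in ordered if lba < head_lba]
    return (ahead + behind)[:queue_slots]


# Hypothetical pending requests, identified only by logical block number.
pending = [9100, 120, 5050, 5010, 130, 5030, 9000, 5020]

# Assume the last completed request was near block 5000 and the drive
# exposes four command-queue slots (both values are illustrative).
batch = next_batch(pending, head_lba=5000, queue_slots=4)
print(batch)  # [5010, 5020, 5030, 5050]
```

The kernel only decides which requests go out together; the order in which the drive services the four of them is left to the drive, since that is where the knowledge of track boundaries lives.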

> My argument is that a sufficiently smart kernel scheduler *should*
> yield performance results that are reasonably close to what you can
> get with that feature. Perhaps not quite as good, but reasonably
> close. It shouldn't be an orders-of-magnitude type difference.

That might be the case with respect to decisions about long seeks,
but not with respect to rotational latency. The kernel simply hasn't
got the information.
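
To put rough numbers on the rotational part, here is a back-of-envelope sketch. The spindle speeds are common values chosen for illustration, not figures from this thread, and the "average wait" assumes a request lands at a random point in the rotation.

```python
def rotation_ms(rpm: int) -> float:
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


# Illustrative spindle speeds; not taken from any drive discussed here.
for rpm in (7200, 10000, 15000):
    full = rotation_ms(rpm)
    # With no knowledge of angular position, the expected wait for a
    # sector to come under the head is half a revolution.
    print(f"{rpm} rpm: full rotation {full:.2f} ms, average wait {full / 2:.2f} ms")
```

A scheduler that cannot see track boundaries or angular position has no way to do better than that half-revolution average within a cylinder, which is the part only the drive can optimize.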

regards, tom lane
