Re: Sequential Scan Read-Ahead

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequential Scan Read-Ahead
Date: 2002-04-25 10:47:13
Message-ID: Pine.NEB.4.43.0204251937500.3111-100000@angelic.cynic.net
Lists: pgsql-hackers

On Thu, 25 Apr 2002, Lincoln Yeoh wrote:

> I think the raw partitions will be more trouble than they are worth.
> Reading larger chunks at appropriate circumstances seems to be the "low
> hanging fruit".

That's certainly a good start. I don't know if the raw partitions
would be more trouble than they are worth, but it certainly would
be a lot more work, yes. One could do pretty much as well, I think,
by using the "don't buffer blocks for this file" option on those
OSes that have it.
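
As a rough illustration of the two ideas above (reading in larger chunks, and asking the OS not to keep the blocks cached), here is a minimal Python sketch. It is not PostgreSQL code. The names `scan_file`, `_advise`, `demo` and `CHUNK_SIZE` are hypothetical, and the chunk size is an arbitrary assumption. It uses `os.posix_fadvise` as an advisory stand-in where the platform provides it; a true "don't buffer" open flag differs from OS to OS and is not shown.

```python
import os
import tempfile

# Assumed read size for illustration only; a real value would need tuning.
CHUNK_SIZE = 1024 * 1024


def _advise(fd, offset, length, advice_name):
    """Pass an advisory hint to the OS if this platform supports it."""
    advice = getattr(os, advice_name, None)
    if hasattr(os, "posix_fadvise") and advice is not None:
        try:
            os.posix_fadvise(fd, offset, length, advice)
        except OSError:
            pass  # It is only a hint, so failure is not fatal.


def scan_file(path, chunk_size=CHUNK_SIZE):
    """Read a file front to back in large chunks; return bytes read."""
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    total = 0
    try:
        # Tell the OS we intend to read sequentially.
        _advise(fd, 0, 0, "POSIX_FADV_SEQUENTIAL")
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # Hint that the pages just read need not stay cached.
            _advise(fd, total, len(chunk), "POSIX_FADV_DONTNEED")
            total += len(chunk)
    finally:
        os.close(fd)
    return total


def demo():
    """Write a scratch file, scan it, return (bytes_written, bytes_read)."""
    size = 3 * CHUNK_SIZE + 17
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * size)
        path = tmp.name
    try:
        return size, scan_file(path)
    finally:
        os.unlink(path)
```

On platforms without `posix_fadvise` the hints are simply skipped and the function still performs the large-chunk read.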

> [1] The theory was the drive typically has to jump around a lot more for
> metadata than for files. In practice it worked pretty well, if I do say so
> myself :). Not sure if modern HDDs do specialized O/S metadata caching
> (wonder how many megabytes would typically be needed for 18GB drives :) ).

Sure they do, though they don't necessarily read it all. Most
Unix systems have a special cache for namei lookups (turning a filename
into an i-node number), often one per process as well as a system-wide
one. And on machines with a unified buffer cache for file data,
there's still a separate metadata cache.
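
To make the namei idea concrete, here is a toy user-space sketch of a name-lookup cache: it maps a (directory, name) pair to an i-node number and remembers the answer. This is only an analogy, not how any kernel implements it. The function name `lookup_inode` and the cache size are assumptions for illustration.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1024)  # Assumed size; purely illustrative.
def lookup_inode(directory, name):
    """Resolve one name within a directory to an i-node number."""
    return os.stat(os.path.join(directory, name)).st_ino
```

A repeated call with the same arguments is answered from the cache without touching the filesystem. A real name cache also has to invalidate entries when files are renamed or removed, which this sketch does not attempt.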

But in fact, at least with BSD's FFS, there's probably not quite
as much jumping as you'd think. An FFS filesystem is divided into
"cylinder groups" (though these days the groups don't necessarily
match the physical cylinder boundaries on the disk) and a file's
i-node entry is kept in the same cylinder group as the file's data,
or at least the first part of it.
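
The locality argument can be sketched with simple arithmetic: if i-nodes and data blocks are both numbered group by group, integer division gives the group each one lives in. The per-group sizes below are made-up values for illustration (a real filesystem records its own in the superblock), and the function names are hypothetical.

```python
# Hypothetical layout parameters, not taken from any real filesystem.
INODES_PER_GROUP = 2048
BLOCKS_PER_GROUP = 16384


def inode_group(inode_number):
    """Cylinder group that holds the given i-node."""
    return inode_number // INODES_PER_GROUP


def block_group(block_number):
    """Cylinder group that holds the given data block."""
    return block_number // BLOCKS_PER_GROUP


def same_group(inode_number, block_number):
    """True if the i-node and the data block share a cylinder group."""
    return inode_group(inode_number) == block_group(block_number)
```

With these assumed sizes, i-node 5000 and block 40000 both fall in group 2, so going from the i-node to that block stays within one region of the disk.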

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC
