Re: Sequential Scan Read-Ahead

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Curt Sampson <cjs(at)cynic(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequential Scan Read-Ahead
Date: 2002-04-25 01:56:41
Message-ID: 200204250156.g3P1ufh05751@candle.pha.pa.us
Lists: pgsql-hackers

Curt Sampson wrote:
> On Wed, 24 Apr 2002, Bruce Momjian wrote:
>
> > We expect the file system to do read-aheads during a sequential scan.
> > This will not happen if someone else is also reading buffers from that
> > table in another place.
>
> Right. The essential difficulties are, as I see it:
>
> 1. Not all systems do readahead.

If they don't, that isn't our problem. We expect it to be there, and if
it isn't, the vendor/kernel is at fault.

> 2. Even systems that do do it cannot always reliably detect that
> they need to.

Yes, a seek() in the file will turn off read-ahead. Grabbing bigger chunks
would help here, but if you have two people already reading from the
same file, grabbing bigger chunks of the file may not be optimal.
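
As a rough illustration of what "grabbing bigger chunks" means, here is a
minimal sketch (not PostgreSQL code) that scans a file sequentially and asks
for several blocks per read() call. The names scan_file and blocks_per_read
are hypothetical, and the 8 kB BLOCK_SIZE is an assumption chosen to match
PostgreSQL's default block size.

```python
import os

BLOCK_SIZE = 8192  # assumed block size (PostgreSQL's default is 8 kB)


def scan_file(path, blocks_per_read=1):
    """Read `path` from start to end, requesting `blocks_per_read` blocks
    per read() call.

    Returns (bytes_read, read_calls), counting only calls that returned data.
    """
    request = BLOCK_SIZE * blocks_per_read
    total = 0
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, request)  # one read() syscall per iteration
            if not buf:  # an empty result means end of file
                break
            total += len(buf)
            calls += 1
    finally:
        os.close(fd)
    return total, calls
```

Calling scan_file(path, 8) returns the same byte count as
scan_file(path, 1) with fewer read() calls. It says nothing about whether the
larger request is a good idea when two backends are reading the same file,
which is the open question above.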

> 3. Even when the read-ahead does occur, you're still doing more
> syscalls, and thus more expensive kernel/userland transitions, than
> you have to.

I would guess the performance impact is minimal.
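
To put numbers on the syscall count being discussed, here is a small
back-of-the-envelope sketch. The function name read_calls and the 1 GB table
size are illustrative assumptions; it only counts calls and makes no claim
about how much time each call costs.

```python
def read_calls(table_bytes, request_bytes):
    """Number of read() calls needed to scan `table_bytes` sequentially
    when each call requests `request_bytes` (rounding up for a final
    partial request)."""
    return (table_bytes + request_bytes - 1) // request_bytes


ONE_GB = 1 << 30  # illustrative table size, not a figure from this thread

print(read_calls(ONE_GB, 8192))   # 8 kB per call  -> 131072
print(read_calls(ONE_GB, 65536))  # 64 kB per call -> 16384
```

Whether saving those calls matters depends on the per-call overhead relative
to the disk I/O itself, which would have to be measured.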

> Has anybody considered writing a storage manager that uses raw
> partitions and deals with its own buffer caching? This has the potential
> to be a lot more efficient, since the database server knows much more
> about its workload than the operating system can guess.

We have talked about it, but rejected it. Search for 'raw' in the
optimizer and performance files in TODO.detail. There is also interesting
information there about the optimizer cost estimates we have been discussing.

Specifically see:

http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg00009.html

Also see:

http://candle.pha.pa.us/mhonarc/todo.detail/optimizer/msg00011.html

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
