Re: Prereading using posix_fadvise (was Re: Commitfest patches)

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Zeugswetter Andreas OSB SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Gregory Stark <stark(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Prereading using posix_fadvise (was Re: Commitfest patches)
Date: 2008-03-28 15:58:24
Message-ID: 20080328155824.GD9150@svana.org
Lists: pgsql-hackers

On Fri, Mar 28, 2008 at 11:41:58AM -0400, Bruce Momjian wrote:
> Should we consider only telling the kernel X pages ahead, meaning when
> we are on page 10 we tell it about page 16?

It's not so interesting for sequential reads; the kernel can work that
out for itself. Disk reads are usually in blocks of at least 128K
anyway, so there's a real good chance you already have block 16.

The interesting case is an index scan, where you do a posix_fadvise() on
the next block *before* returning the items in the current block. Then
by the time you've processed these tuples, the next block will
hopefully have been read in and we can proceed without delay.
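
As a rough illustration of the idea above (not code from any patch), here
is a minimal sketch assuming Linux/glibc and an 8K block size. The names
BLOCK_SIZE, prefetch_block() and scan_with_prefetch() are hypothetical;
only posix_fadvise() with POSIX_FADV_WILLNEED and pread() are real calls.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600       /* exposes posix_fadvise() and pread() */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size, for illustration only */

/* Hint that block blkno will be needed soon. The call is advisory and
 * returns 0 on success or an error number on failure. */
int prefetch_block(int fd, long blkno)
{
    return posix_fadvise(fd, (off_t) blkno * BLOCK_SIZE, BLOCK_SIZE,
                         POSIX_FADV_WILLNEED);
}

/* Visit blocks in the given order. After reading the current block, issue
 * the hint for the next one *before* processing the current block, so the
 * kernel can fetch it while we are busy. Returns total bytes read, or -1
 * on a read error. */
long scan_with_prefetch(int fd, const long *blocks, int nblocks, char *buf)
{
    long total = 0;

    for (int i = 0; i < nblocks; i++)
    {
        ssize_t n = pread(fd, buf, BLOCK_SIZE, (off_t) blocks[i] * BLOCK_SIZE);

        if (n < 0)
            return -1;

        /* A failed hint is harmless; we just lose the overlap. */
        if (i + 1 < nblocks)
            (void) prefetch_block(fd, blocks[i + 1]);

        /* ... process the tuples in buf here ... */
        total += n;
    }
    return total;
}
```

Whether the hint actually hides the read latency depends on how long the
processing step takes relative to the I/O, which this sketch does not
measure.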

Or you could fadvise all the tuples referred to from an index page at
once, so the kernel can determine the optimal order to fetch them. The
read() calls will still be in tuple order, but the delay will
(hopefully) be less.
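
A sketch of that second variant, again only under assumptions: Linux/glibc,
an 8K block size, and the heap block numbers already extracted from the
index page into an array. HEAP_BLOCK_SIZE, advise_all() and
fetch_in_index_order() are made-up names for illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600       /* exposes posix_fadvise() and pread() */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define HEAP_BLOCK_SIZE 8192    /* assumed block size, for illustration only */

/* Issue a WILLNEED hint for every block in the list up front, skipping
 * immediate repeats. Returns the number of hints the kernel accepted. */
int advise_all(int fd, const long *blocks, int nblocks)
{
    int issued = 0;

    for (int i = 0; i < nblocks; i++)
    {
        if (i > 0 && blocks[i] == blocks[i - 1])
            continue;           /* same block as the previous tuple */

        if (posix_fadvise(fd, (off_t) blocks[i] * HEAP_BLOCK_SIZE,
                          HEAP_BLOCK_SIZE, POSIX_FADV_WILLNEED) == 0)
            issued++;
    }
    return issued;
}

/* Hint everything first, then read in the original (index) order. The
 * kernel is free to service the hinted blocks in whatever order it likes.
 * Returns total bytes read, or -1 on a read error. */
long fetch_in_index_order(int fd, const long *blocks, int nblocks, char *buf)
{
    long total = 0;

    (void) advise_all(fd, blocks, nblocks);

    for (int i = 0; i < nblocks; i++)
    {
        ssize_t n = pread(fd, buf, HEAP_BLOCK_SIZE,
                          (off_t) blocks[i] * HEAP_BLOCK_SIZE);

        if (n < 0)
            return -1;

        /* ... return the tuple(s) from buf here ... */
        total += n;
    }
    return total;
}
```

The reads stay in index order; the only change is that the kernel hears
about all of them before the first read() blocks.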

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
