Re: Asynchronous I/O Support

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Raja Agrawal <raja(dot)agrawal(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Asynchronous I/O Support
Date: 2006-10-15 19:44:04
Message-ID: 20061015194404.GC30986@svana.org
Lists: pgsql-hackers

On Sun, Oct 15, 2006 at 02:26:12PM -0400, Neil Conway wrote:
> On Sun, 2006-10-15 at 19:56 +0200, Martijn van Oosterhout wrote:
> > Sure, I even implemented it once. Didn't get any faster.
>
> Did you just do something akin to s/read/aio_read/ etc., or something
> more ambitious? I think that really taking advantage of the ability to
> have multiple I/O requests outstanding would take some leg work.

Sure. Basically, at certain strategic points in the code there were
extra ReadAsyncBuffer() calls (in the IndexScan node and the b-tree
scan code). The call was allowed to do nothing, but if there were
not too many outstanding requests and a buffer was available, it would
allocate a buffer and initiate an AIO request for it.

IIRC there was a table of outstanding requests (I think I originally
allowed up to 32) and when a normal ReadBuffer() found the block had
already been requested, it "waited" on that block.
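
A toy model of that bookkeeping, in C. Nothing here is taken from the
actual patch: the names (prefetch_block, read_block, MAX_OUTSTANDING),
the flat array and the absence of any real I/O are assumptions made for
illustration. Only the limit of 32 and the general behaviour come from
the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OUTSTANDING 32          /* the limit mentioned in the mail */

/* Hypothetical table entry: one slot per in-flight request. */
typedef struct {
    bool in_use;
    long block;                     /* block number being fetched */
} AsyncRequest;

static AsyncRequest outstanding[MAX_OUTSTANDING];
static int n_outstanding = 0;

/* Return the slot tracking this block, or NULL if none does. */
static AsyncRequest *find_request(long block)
{
    for (int i = 0; i < MAX_OUTSTANDING; i++)
        if (outstanding[i].in_use && outstanding[i].block == block)
            return &outstanding[i];
    return NULL;
}

/*
 * Prefetch hint. It is allowed to do nothing: it returns false if the
 * block is already requested or the table is full, true if a new
 * request was recorded.
 */
bool prefetch_block(long block)
{
    if (find_request(block) != NULL)
        return false;
    if (n_outstanding >= MAX_OUTSTANDING)
        return false;
    for (int i = 0; i < MAX_OUTSTANDING; i++) {
        if (!outstanding[i].in_use) {
            outstanding[i].in_use = true;
            outstanding[i].block = block;
            n_outstanding++;
            /* real code would allocate a buffer and start the AIO here */
            return true;
        }
    }
    return false;
}

/*
 * Normal read path. Returns true if the block had been prefetched
 * (real code would wait for that request to complete), false if the
 * caller has to fall back to an ordinary synchronous read.
 */
bool read_block(long block)
{
    AsyncRequest *req = find_request(block);

    if (req == NULL)
        return false;
    assert(n_outstanding > 0);
    req->in_use = false;            /* request consumed, free the slot */
    n_outstanding--;
    return true;
}
```

The point of keeping the hint optional is that callers never need to
check whether it succeeded: a skipped prefetch just means the later
read takes the ordinary path.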

The principle was that the index-scan node would read a page full of
tids, submit a ReadAsyncBuffer() on each one, and then proceed as
normal. Fairly unintrusive patch all up. ifdeffing it out is safe, and
#defining ReadAsyncBuffer() away causes the compiler to optimise the
loop away altogether.
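
The compile-out trick can be sketched roughly like this. It is an
illustration, not the patch: USE_ASYNC_IO, prefetch_calls, scan_page
and the single block-number argument are all hypothetical, since the
real ReadAsyncBuffer() signature is not given here.

```c
#include <assert.h>
#include <stddef.h>

int prefetch_calls = 0;     /* hypothetical counter, for demonstration only */

#ifdef USE_ASYNC_IO
/* Async build: the hint does real work (here it only counts calls). */
void ReadAsyncBuffer(long block)
{
    (void) block;
    prefetch_calls++;
}
#else
/* Non-async build: the hint expands to a no-op expression, so a loop
 * that does nothing but issue hints has no effect and the optimiser
 * is free to drop it. */
#define ReadAsyncBuffer(block) ((void) (block))
#endif

/*
 * Stand-in for the index-scan step: hint every block referenced by the
 * page of TIDs, then carry on as normal. Returns the number of blocks.
 */
size_t scan_page(const long *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        ReadAsyncBuffer(blocks[i]);

    /* ... the ordinary ReadBuffer() calls on each block would follow ... */
    return n;
}
```

Built without USE_ASYNC_IO defined, scan_page() never touches
prefetch_calls, which is what makes ifdeffing the feature out safe.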

The POSIX AIO layer sucks somewhat so it was tricky but it did work.
The hardest part is really how to decide if a buffer currently in the
buffercache is worth more than an asynchronously loaded buffer that may
not be used.

I posted the results to -hackers some time ago, so you can always try
that.

> At least according to [1], kernel AIO on Linux still doesn't work for
> buffered (i.e. non-O_DIRECT) files. There have been patches available
> for quite some time that implement this, but I'm not sure when they are
> likely to get into the mainline kernel.

You can also do it by spawning off threads to do the requests. The
glibc emulation uses threads, but only allows one outstanding request
per file, which makes it useless for our purposes...
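
A minimal sketch of the spawn-a-thread approach, assuming POSIX threads
and pread(). This is neither the glibc emulation nor anything from the
patch; ThreadRead, thread_read_start, thread_read_wait and the demo
function are hypothetical names. One thread per request is the simplest
possible form, and a real implementation would more likely use a pool
of workers. Because pread() takes an explicit offset, several requests
can be outstanding on the same file descriptor without interfering
with each other.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request descriptor: one per outstanding read. */
typedef struct {
    int       fd;
    void     *buf;
    size_t    len;
    off_t     offset;
    ssize_t   result;               /* bytes read, or -1 on error */
    pthread_t thread;
} ThreadRead;

static void *thread_read_main(void *arg)
{
    ThreadRead *r = arg;

    r->result = pread(r->fd, r->buf, r->len, r->offset);
    return NULL;
}

/* Start the read in its own thread; returns 0 or a pthread error code. */
int thread_read_start(ThreadRead *r)
{
    r->result = -1;
    return pthread_create(&r->thread, NULL, thread_read_main, r);
}

/* Block until the read has finished and return its result. */
ssize_t thread_read_wait(ThreadRead *r)
{
    int rc = pthread_join(r->thread, NULL);

    assert(rc == 0);
    (void) rc;
    return r->result;
}

/* Demo: two reads outstanding on the same file at once; 1 on success. */
int demo_two_outstanding_reads(void)
{
    char path[] = "/tmp/aio_demo_XXXXXX";
    int  fd = mkstemp(path);

    if (fd < 0)
        return 0;
    unlink(path);                   /* file disappears once fd is closed */
    if (write(fd, "abcdefgh", 8) != 8) {
        close(fd);
        return 0;
    }

    char a[4], b[4];
    ThreadRead r1 = { .fd = fd, .buf = a, .len = 4, .offset = 0 };
    ThreadRead r2 = { .fd = fd, .buf = b, .len = 4, .offset = 4 };

    if (thread_read_start(&r1) != 0) {
        close(fd);
        return 0;
    }
    if (thread_read_start(&r2) != 0) {
        thread_read_wait(&r1);
        close(fd);
        return 0;
    }

    ssize_t n1 = thread_read_wait(&r1);
    ssize_t n2 = thread_read_wait(&r2);

    close(fd);
    return n1 == 4 && n2 == 4 &&
           memcmp(a, "abcd", 4) == 0 && memcmp(b, "efgh", 4) == 0;
}
```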

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
