Re: Extended Prefetching using Asynchronous IO - proposal and patch

From: johnlumby <johnlumby(at)hotmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extended Prefetching using Asynchronous IO - proposal and patch
Date: 2014-06-01 00:44:28
Message-ID: BLU436-SMTP6AB1FF61C05A13A944357A3210@phx.gbl
Lists: pgsql-general pgsql-hackers

On 05/30/14 09:36, Claudio Freire wrote:
> On Fri, May 30, 2014 at 4:15 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> I see no such rule in the man pages of any of the functions
>> involved. And it wouldn't matter anyway; the behavior is exactly the
>> same if you aio_error() first.

You are completely correct and I was wrong.
In the different-process case, it is not the order of the aio calls
that determines whether it works; it is whether the caller polls
with a timeout. I think I was confusing this with a rule about the
order of calling aio_return().

>>>>> We don't want polling... And even if we did, calling aio_suspend() in a
>>>>> way
>>>>> that's known to be broken, in a loop, is a pretty crappy way of polling.

I have changed the complete_aio functions to poll *only* if the
caller is not the originator of the aio_read. If it is the
originator, it now calls aio_suspend() with no timeout
(i.e. waits until completion).
The loop also now handles the EINTR case which someone
pointed out.

In my test runs, with a debug message in FileCompleteaio reporting
whether the caller is the originator of the aio_read,
I see that > 99.8% of calls are from the originator and < 0.2% are not.
e.g. (samples from two backends)

different : 10
same : 11726

different : 38
same : 12105

A new patch based on today's (140531) tree is attached.

This improves one of my benchmarks by about 10% in throughput,
and now shows an overall 23% improvement relative to the existing
posix_fadvise-based code.

So hopefully this addresses your performance concern.

If you look at the new patch, you'll see that for the different-pid case
I still call aio_suspend() with a timeout.
As you or Claudio pointed out earlier, it could just as well sleep
for the same interval, but the small advantage of calling aio_suspend()
is that it returns immediately if the io completed between
aio_error() returning EINPROGRESS and the aio_suspend() call.
It also keeps the code simpler. In fact this change is quite small:
just a few lines in backend/storage/buffer/buf_async.c and
backend/storage/file/fd.c.

Based on this, I think it is not necessary to get rid of the polling
altogether (and in any case, as far as I can see, doing so would be
very difficult).

>>>
>>> Well, as mentioned earlier, it is not broken. Whether it is
>>> efficient I am not sure.
>>> I have looked at the mutex in aio_suspend that you mentioned and I am not
>>> quite convinced that, if caller is not the original aio_read process,
>>> it renders the suspend() into an instant timeout. I will see if I can
>>> verify that.
>>
>> I don't see the point of pursuing this design further. Surely we don't want
>> to use polling here, and you're relying on undefined behavior anyway. I'm
>> pretty sure aio_return/aio_error won't work from a different process on all
>> platforms, even if it happens to work on Linux. Even on Linux, it might stop
>> working if the underlying implementation changes from the glibc pthread
>> emulation to something kernel-based.

Good point. I have incorporated the guts of your little test program
(modified to do polling) into the existing autoconf test program
that decides on the #define USE_AIO_ATOMIC_BUILTIN_COMP_SWAP;
see config/c-library.m4.
I hope this goes some way toward answering your concern about
robustness: at least now, if the implementation changes in some way
that renders the polling ineffective, it will be caught at configure
time.
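The general shape of such a configure-time probe is something like the
following (a sketch only, not the actual macro in c-library.m4; the
cache-variable name is made up):

```m4
# Run a program at configure time that issues an aio_read and then
# polls for completion with a timed aio_suspend(), the way the
# backend's different-pid path does; if the poll never succeeds,
# report failure so the async-io support can be left disabled.
AC_RUN_IFELSE([AC_LANG_SOURCE([[
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
int main(void)
{
  char buf[8];
  struct aiocb cb;
  struct timespec ts = { 0, 10000000 };   /* 10 ms */
  const struct aiocb *list[1] = { &cb };
  int fd = open("conftest.dat", O_CREAT | O_RDWR, 0600);
  if (fd < 0 || write(fd, "x", 1) != 1) return 1;
  memset(&cb, 0, sizeof(cb));
  cb.aio_fildes = fd;
  cb.aio_buf = buf;
  cb.aio_nbytes = 1;
  if (aio_read(&cb) != 0) return 1;
  while (aio_error(&cb) == EINPROGRESS)
    if (aio_suspend(list, 1, &ts) == -1 &&
        errno != EAGAIN && errno != EINTR)
      return 1;
  return aio_return(&cb) == 1 ? 0 : 1;
}
]])],
[pgac_cv_aio_polling=yes],
[pgac_cv_aio_polling=no],
[pgac_cv_aio_polling=no])
```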

> I'll try to do some measuring of performance with:
> a) git head
> b) git head + patch as-is
> c) git head + patch without aio_suspend in foreign processes (just re-read)
> d) git head + patch with a lwlock (or whatever works) instead of aio_suspend
>
> a-c will be the fastest, d might take some while.
>
> I'll let you know of the results as I get them.
>
>
Claudio, I am not quite sure whether what I am submitting now
matches any of your variants (a)-(d). As I promised before, but have
not yet done, I will package up one or two of my benchmarks and
send them in.

Attachment Content-Type Size
postgresql-9.4.140531.async_io_prefetching.patch text/x-patch 310.1 KB
