Re: Performance degradation in commit ac1d794

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Васильев Дмитрий <d(dot)vasilyev(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance degradation in commit ac1d794
Date: 2016-03-20 22:52:43
Message-ID: CAEepm=1CuAWfxDk==jZ7pgCDCv52fiUnDSpUvmznmVmRKU5zpA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 21, 2016 at 3:46 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2016-03-21 01:31:30 +1300, Thomas Munro wrote:
>> I couldn't get the second patch to apply for some reason,
>
> Weird? Even after applying the first one first?
>
>
>> but I have been trying out your "latch" branch on some different OSs
>> and porting some code that does a bunch of waiting on many sockets
>> over to this API to try it out.
>
>> One thing that I want to do but can't with this interface is remove an
>> fd from the set.
>
> Yea, I didn't see an in-core need for that, so I didn't want to add
> that yet.
>
> Other than that, how did things go?

So far, so good. No surprises.

>> I can call AddWaitEventToSet, which returns a position, and I can
>> call ModifyWait to provide a new event mask by position (including a
>> zero mask), but I can't actually remove the fd (for example to avoid
>> future error events that can't be masked, or to allow that fd to be
>> closed and perhaps allow that fd number to coincidentally be re-added
>> later, and generally to free the slot). There is an underlying way
>> to remove an fd from a set with poll (sort of), epoll, and kqueue.
>> (Not sure about Windows. But surely...). I wonder if there should
>> be a RemoveWaitEventFromSet(set, position) which recycles event
>> slots, sticking them on a freelist (and setting the corresponding
>> pollfd structs' fd to -1).
>
> I'm inclined to think that if we add this, we should memmove
> everything after it and adjust the offsets. As we e.g. pass the
> entire pollfd array to poll() at once, not having gaps will be more
> important.
>
> What's the use case where you need that?

I was experimenting with a change to Append so that it could deal with
asynchronous subplans, and in particular support asynchronous foreign
scans. See the attached patch, which applies on top of your latch
branch (or your two patches above, I assume) and does that. It is not
a very ambitious form of asynchrony and there are opportunities to do
so much more in this area (about which more later in some future
thread), but it is some relevant educational code I had to hand, so I
ported it to your new API as a way to try the API out.

The contract that I invented here is that an async-aware parent node
can ask any child node "are you ready?" and get back various answers,
including an fd that means "please don't call ExecProcNode until this
fd is ready to read". But after ExecProcNode is called, the fd must
not be accessed again (the subplan has the right to close it, return a
different one next time, etc), so it must not appear in any
WaitEventSet waited on after that. As you can see in
append_next_async_wait, it therefore needs to create a new
WaitEventSet every time it needs to wait, which makes it feel more
like select() than epoll(). Ideally it'd have a single WaitEventSet
for the lifetime of the Append node, and just add and remove fds as
required.

>> I wonder if it would be useful to reallocate the arrays as needed so
>> there isn't really a hard limit to the number of things you add, just
>> an initial size.
>
> Doubles the amount of palloc()s required. As lots of these sets are
> actually going to be very short-lived (a single WaitLatch call), that's
> something I'd rather avoid. Although I guess you could just allocate
> separate arrays at that moment.

On the subject of short-lived and frequent calls to WaitLatchOrSocket,
I wonder if there would be some benefit in reusing a statically
allocated WaitEventSet for that. That would become possible if you
could add and remove the latch and socket as discussed, with an
opportunity to skip the modification work completely if the reusable
WaitEventSet already happens to have the right stuff in it from the
last WaitLatchOrSocket call. Or maybe the hot wait loops should
simply be rewritten to reuse a WaitEventSet explicitly so they can
manage that...

Some other assorted thoughts:

* It looks like epoll (and maybe kqueue?) can associate user data with
an event and give it back to you; if you have a mapping between fds
and some other object (in the case of the attached patch, subplan
nodes that are ready to be pulled), would it be possible and useful to
expose that (and simulate it where needed) rather than the caller
having to maintain associative data structures (see the linear search
in my patch)?

* I would be interested in writing a kqueue implementation of this for
*BSD (and MacOSX?) at some point if someone doesn't beat me to it.

* It would be very cool to get some view into which WaitEventSetWait
call backends are waiting in from a system view if it could be done
cheaply enough. A name/ID for the call site, and an indication of
which latch and how many fds... or something.

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
async-append-wait-set-hack.patch application/octet-stream 23.0 KB
