Re: Performance degradation in commit ac1d794

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Васильев Дмитрий <d(dot)vasilyev(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance degradation in commit ac1d794
Date: 2016-03-21 05:09:18
Message-ID: 20160321050918.GA6781@awork2.anarazel.de
Lists: pgsql-hackers

Hi,

On 2016-03-21 11:52:43 +1300, Thomas Munro wrote:
> The contract that I invented here is that an async-aware parent node
> can ask any child node "are you ready?" and get back various answers
> including an fd which means please don't call ExecProcNode until this
> fd is ready to read. But after ExecProcNode is called, the fd must
> not be accessed again (the subplan has the right to close it, return a
> different one next time etc), so it must not appear in any
> WaitEventSet wait on after that.

Hm, why did you choose to go with that contract? Why do children need to
switch fds and such?
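
For concreteness, the sort of contract being described might be written
down roughly like this (all names below are invented for illustration,
not taken from the patch):

/* Hypothetical types -- invented to illustrate the contract, not taken
 * from the patch. */
typedef enum AsyncReadiness
{
    ASYNC_READY,            /* ExecProcNode() may be called now */
    ASYNC_NOT_READY         /* wait for 'fd' to become readable first */
} AsyncReadiness;

typedef struct AsyncReadyResult
{
    AsyncReadiness status;
    int         fd;         /* only meaningful for ASYNC_NOT_READY; becomes
                             * invalid once ExecProcNode() is called, since
                             * the child may close or replace it */
} AsyncReadyResult;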

> As you can see, in append_next_async_wait, it therefore needs to
> create a new WaitEventSet every time it needs to wait, which makes it
> feel more like select() than epoll(). Ideally it'd just have just one
> single WaitEventSet for the lifetime of the append node, and just add
> and remove fds as required.

Yes, I see. I think that'd be less efficient than not manipulating the
set all the time, but still a lot more efficient than adding/removing
the postmaster death latch every round.

Anyway, I want to commit this ASAP, so I can get on to some patches in
the CF. But I'd welcome you playing with adding that ability.

> On the subject of short-lived and frequent calls to WaitLatchOrSocket,
> I wonder if there would be some benefit in reusing a statically
> allocated WaitEventSet for that.

Wondered the same. It looks like there'd need to be a number of those
(just latch, latch + postmaster death) for it to be beneficial. Most
latch waits aren't that frequent though, so I've decided not to go
there initially.

> That would become possible if you
> could add and remove the latch and socket as discussed, with an
> opportunity to skip the modification work completely if the reusable
> WaitEventSet already happens to have the right stuff in it from the
> last WaitLatchOrSocket call. Or maybe the hot wait loops should
> simply be rewritten to reuse a WaitEventSet explicitly so they can
> manage that...

Yea, I think that's better.
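
As a minimal sketch of that kind of rewrite, assuming the WaitEventSet
API as posted ('sock' and the readiness handling are placeholders):

/*
 * Sketch of a hot wait loop reusing one WaitEventSet, instead of going
 * through WaitLatchOrSocket() and re-registering everything on each call.
 */
#include "postgres.h"
#include "miscadmin.h"
#include "storage/ipc.h"
#include "storage/latch.h"

static void
wait_loop(pgsocket sock)
{
    WaitEventSet *set;
    WaitEvent    event;

    /* built once, outside the loop */
    set = CreateWaitEventSet(CurrentMemoryContext, 3);
    AddWaitEventToSet(set, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
    AddWaitEventToSet(set, WL_POSTMASTER_DEATH, PGINVALID_SOCKET, NULL, NULL);
    AddWaitEventToSet(set, WL_SOCKET_READABLE, sock, NULL, NULL);

    for (;;)
    {
        /* -1: no timeout; the registrations survive across iterations */
        WaitEventSetWait(set, -1, &event, 1);

        if (event.events & WL_POSTMASTER_DEATH)
            proc_exit(1);
        if (event.events & WL_LATCH_SET)
            ResetLatch(MyLatch);
        if (event.events & WL_SOCKET_READABLE)
            break;              /* data available, do the real work */
    }

    FreeWaitEventSet(set);
}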

> * It looks like epoll (and maybe kqueue?) can associate user data with
> an event and give it back to you; if you have a mapping between fds
> and some other object (in the case of the attached patch, subplan
> nodes that are ready to be pulled), would it be possible and useful to
> expose that (and simulate it where needed) rather than the caller
> having to maintain associative data structures (see the linear search
> in my patch)?

We already use the pointer that epoll gives you. But note that the
'WaitEvent' struct already contains the original position the event was
registered at. That's filled in when you call WaitEventSetWait(). So
you could either just build an array based on ->pos, or we could also
add a 'void *private' member or such to the definition.
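
As a rough sketch of the ->pos approach (the array and the PlanState
bookkeeping are invented for illustration; a private/user_data pointer
hung off the event would serve the same purpose):

/*
 * Sketch: map readiness events back to subplans via the registration
 * position returned by AddWaitEventToSet(), instead of a linear search.
 */
#include "postgres.h"
#include "nodes/execnodes.h"
#include "storage/latch.h"

#define MAX_ASYNC_SUBPLANS 16           /* illustrative limit */

static PlanState *subplan_by_pos[MAX_ASYNC_SUBPLANS];

/* registration side, once per async-capable child */
static void
register_async_subplan(WaitEventSet *set, PlanState *child, pgsocket child_fd)
{
    int     pos;

    pos = AddWaitEventToSet(set, WL_SOCKET_READABLE, child_fd, NULL, NULL);
    subplan_by_pos[pos] = child;
}

/* wait side: ->pos leads straight back to the ready node */
static PlanState *
wait_for_ready_subplan(WaitEventSet *set)
{
    WaitEvent   occurred;

    if (WaitEventSetWait(set, -1 /* no timeout */, &occurred, 1) == 1)
        return subplan_by_pos[occurred.pos];
    return NULL;
}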

> * I would be interested in writing a kqueue implementation of this for
> *BSD (and MacOSX?) at some point if someone doesn't beat me to it.

I hoped that somebody would do that - that'd afaics be the only major
API missing.

> * It would be very cool to get some view into which WaitEventSetWait
> call backends are waiting in from a system view if it could be done
> cheaply enough. A name/ID for the call site, and an indication of
> which latch and how many fds... or something.

That's not something I plan to tackle today; we don't have wait event
integration for latches atm. There are some threads somewhere about this,
but that seems mostly independent of the way latches are implemented.

Thanks,

Andres
