Re: asynchronous execution

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: asynchronous execution
Date: 2016-10-04 11:53:50
Message-ID: CAJ3gD9f5t_ZP7RKv6r-Am=KPuui0JuLpOEhQfc3O3ODD0T6tmg@mail.gmail.com
Lists: pgsql-hackers

On 4 October 2016 at 02:30, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Sep 28, 2016 at 12:30 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>> On 24 September 2016 at 06:39, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Since Kyotaro Horiguchi found that my previous design had a
>>> system-wide performance impact due to the ExecProcNode changes, I
>>> decided to take a different approach here: I created an async
>>> infrastructure where both the requestor and the requestee have to be
>>> specifically modified to support parallelism, and then modified Append
>>> and ForeignScan to cooperate using the new interface. Hopefully that
>>> means that anything other than those two nodes will suffer no
>>> performance impact. Of course, it might have other problems....
>>
>> I see that the reason why you re-designed the asynchronous execution
>> implementation is because the earlier implementation showed
>> performance degradation in local sequential and local parallel scans.
>> But I checked that the ExecProcNode() changes were not that
>> significant as to cause the degradation.
>
> I think we need some testing to prove that one way or the other. If
> you can do some - say on a plan with multiple nested loop joins with
> inner index-scans, which will call ExecProcNode() a lot - that would
> be great. I don't think we can just rely on "it doesn't seem like it
> should be slower", though - ExecProcNode() is too important a function
> for us to guess at what the performance will be.

Agreed. I will come up with some tests.

Also, parent pointers are not required in the new design. Thinking
about parent pointers, it now seems that the event won't get bubbled up
the tree with the new design. Still, I think it's possible to switch
over to the other asynchronous tree when some node in the current
subtree is waiting. But I am not sure; I will think more on that.

>
> The thing I'm really worried about with either implementation is what
> happens when we start to add asynchronous capability to multiple
> nodes. For example, if you imagine a plan like this:
>
> Append
> -> Hash Join
>   -> Foreign Scan
>   -> Hash
>     -> Seq Scan
> -> Hash Join
>   -> Foreign Scan
>   -> Hash
>     -> Seq Scan
>
> In order for this to run asynchronously, you need not only Append and
> Foreign Scan to be async-capable, but also Hash Join. That's true in
> either approach. Things are slightly better with the original
> approach, but the basic problem is there in both cases. So it seems
> we need an approach that will make adding async capability to a node
> really cheap, which seems like it might be a problem.

Yes, we might have to deal with this.
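
To make the concern concrete, here is a rough sketch (a toy model in
Python, not PostgreSQL code) that walks the example plan above and
reports which nodes sit between the Append and a Foreign Scan without
being async-capable. The names PlanNode, example_plan and blockers are
hypothetical, and the set of async-capable node types is an assumption
based on the two nodes the patch is described as modifying.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy plan node: a node type name plus child nodes (hypothetical model)."""
    kind: str
    children: List["PlanNode"] = field(default_factory=list)


def example_plan() -> PlanNode:
    """Build the Append-over-two-Hash-Joins plan from the quoted message."""
    def hash_join() -> PlanNode:
        return PlanNode("Hash Join", [
            PlanNode("Foreign Scan"),
            PlanNode("Hash", [PlanNode("Seq Scan")]),
        ])
    return PlanNode("Append", [hash_join(), hash_join()])


def blockers(node, capable, ancestors=()):
    """List node kinds on a root-to-Foreign-Scan path that are not async-capable.

    One entry is reported per offending node per Foreign Scan reached.
    """
    chain = ancestors + (node,)
    found = []
    if node.kind == "Foreign Scan":
        found.extend(n.kind for n in chain if n.kind not in capable)
    for child in node.children:
        found.extend(blockers(child, capable, chain))
    return found


if __name__ == "__main__":
    # Assumption: only Append and Foreign Scan have been made async-capable.
    print(blockers(example_plan(), {"Append", "Foreign Scan"}))
    # After also teaching Hash Join the async interface, nothing blocks.
    print(blockers(example_plan(), {"Append", "Foreign Scan", "Hash Join"}))
```

In this model Hash and Seq Scan never show up as blockers, since they
are not on a path from the Append to a Foreign Scan; only the Hash Join
nodes stand in the way.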

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
